Umnoon Binta Ali

I am a Data Scientist at Astha.IT. I graduated with distinction in CSE from North South University, where I specialized in Artificial Intelligence.

My academic journey led me to discover the exciting world of Machine Learning and Deep Learning, which I explored through data-centric projects.

I worked as a Research Assistant at North South University. My research focused on analyzing the limitations of the Bangla Language Model and proposing solutions to improve its performance.

Email  /  CV  /  Bio  /  GitHub

Research and Projects

A Deep Convolutional Neural Network for Bangla Handwritten Numeral Recognition with Data Augmentation
CSE 465, Pattern Recognition, Image Processing
Source Code / Academic Report

I used NumtaDB, an unbiased and augmented dataset, to classify images of isolated Bangla numerals. I applied data augmentation with the Keras library, trained a Convolutional Neural Network, and compared the results with the state-of-the-art ResNet34 model.

Transfer Learning for Speaker Diarization on Bangla Audio Dataset
CSE 499, Graduate Dissertation
Source Code / Academic Report

In this research, we explored transfer learning for speaker diarization on our own Bangla audio data. The approach experiments with different clustering algorithms and embedding techniques to reduce the Diarization Error Rate (DER) on noisy data.
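
For readers unfamiliar with the metric, DER is commonly defined as the sum of false-alarm, missed-speech, and speaker-confusion durations divided by the total reference speech duration. The sketch below only illustrates that definition; the function name, argument names, and numbers are hypothetical and are not taken from the project's code or results.

```python
def diarization_error_rate(false_alarm, missed, confusion, total_speech):
    """Return DER as a fraction of total reference speech time.

    All arguments are durations in the same unit (e.g. seconds).
    Hypothetical helper for illustration; not the project's evaluation code.
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (false_alarm + missed + confusion) / total_speech


# Made-up durations in seconds, purely for illustration.
der = diarization_error_rate(
    false_alarm=2.0, missed=3.0, confusion=5.0, total_speech=100.0
)
print(f"DER = {der:.2%}")
```

A lower value means fewer seconds of audio were attributed to the wrong speaker, missed, or falsely marked as speech.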

Contrastive Learning for Text-to-Image on Paraphrasing Captions
CSE 498, Research Project
Source Code / Academic Report

In this research, I presented a contrastive learning strategy for enhancing the quality and semantic consistency of synthetic images by paraphrasing the captions from the CUB dataset.

Outbreak Prediction and Visualization: Covid-19
CSE 299, Junior Design Project
Source Code & Report

Here I used Global Outbreak Alert and Response Network (GOARN) data to predict the time of the highest spread of COVID-19. The time-series visualization was done with Power BI.

Tweet Emotion Recognition with TensorFlow
Source Code

I created a recurrent neural network (RNN) and trained it on a tweet emotion dataset from Hugging Face to learn and recognize emotions in tweets. The dataset has thousands of tweets, each classified into one of 6 emotions. This is a multi-class classification problem in the natural language processing domain.

Streamlit and Python Data Science Webapp for Motor Vehicle Collisions in NYC
Source Code

I built a basic data science webapp with Streamlit for analyzing motor vehicle collisions in NYC. I used the following libraries: numpy, pandas, pydeck and plotly.