- Various Data Science Scripts
- Emotion Recognition in Movie Synopses
- Classification and Prediction of Images of Handwritten Digits in the Kannada Language
- Spotify Songs: Popularity & Genres
- Exploratory Data Analysis: Google Play Store Apps
- Search Engine Optimization (SEO) Handbook
- What about this Web Page?
A collection of coding scripts and notes covering a range of Data Science subjects and programming foundations, such as algorithms and data structures, object-oriented and functional programming, unit testing, virtual environments, databases, shell scripting, and mathematics for Machine Learning in Python.
This repository also includes notes on Amazon Web Services (AWS) and on Augmented and Virtual Reality, based on Coursera specialization courses, as well as several resources on APIs and Containers.
This repository is actively maintained and developed. It mainly serves as personal coding documentation and as a guide to various programming concepts and topics around Data Science.
What makes a movie suspenseful, dramatic, or sci-fi, for example, and which words hidden in movie plots can predict this? The answer would help extract significant information from narrative texts and build an automatic system that produces emotional tags.
This dissertation was submitted in partial fulfillment of the requirements for the degree of MSc Information Management at Strathclyde University (August 2020). It was awarded a distinction, and it has been cited in other academic papers according to mention reports at academia.edu.
The aim was to identify, define, and automatically predict a set of emotions in movies based on their abstracts and metadata. This was achieved through a series of steps including data collection and cleaning, descriptive and inferential statistics, data preprocessing, and Machine & Deep Learning tools and models, with a core focus on Natural Language Processing (NLP).
The problem was treated as a multi-label classification problem. The result was the prediction of emotions in 55,577 unlabelled movies, as well as the identification of correlations between the predicted emotions and users' ratings and preferences.
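To illustrate what "multi-label" means here, the following minimal sketch encodes each movie's emotion tags as a multi-hot target vector, so that one movie can carry several tags at once. The tag names are taken from the examples mentioned earlier; the function name `multi_hot` and the sample data are assumptions made for illustration and are not the dissertation's actual code or label set.

```python
from typing import List, Sequence


def multi_hot(tags_per_movie: Sequence[Sequence[str]],
              label_order: Sequence[str]) -> List[List[int]]:
    """Turn per-movie tag lists into 0/1 vectors, one column per label."""
    index = {label: i for i, label in enumerate(label_order)}
    rows = []
    for tags in tags_per_movie:
        row = [0] * len(label_order)
        for tag in tags:
            row[index[tag]] = 1  # a movie may switch on several columns
        rows.append(row)
    return rows


# Hypothetical label set and sample movies (illustrative only).
EMOTIONS = ["suspenseful", "dramatic", "sci-fi"]
movie_tags = [["suspenseful", "dramatic"], ["sci-fi"], []]

targets = multi_hot(movie_tags, EMOTIONS)
print(targets)
```

A classifier trained on such targets predicts each column independently, which is what distinguishes multi-label from single-label (multi-class) classification.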
Overall, various correlation tests indicated a strong relationship between users' watchlists and their respective emotional tags. It was concluded that the notion of emotion can constitute an important feature in the movie industry, helping recommender systems and advertising companies generate, find, and place more highly personalized content.
This is a Machine Learning project using Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs) with TensorFlow and Keras. More specifically, it addresses a classification problem: the recognition and prediction of handwritten Kannada digits.
The dataset is an alternative to the popular MNIST digits database, and the project was built for a Kaggle class competition (module CS985 Machine Learning) at Strathclyde University (academic year 2019-20).
A range of deep neural network architectures and techniques are applied with the final goal of finding the best-performing model for predicting the handwritten digits.
This is an exploratory data analysis of the popularity of songs from the last 7 decades (1950 - 2010), accompanied by regression and classification problems.
The data originally comes from this Kaggle repository. It was subsequently modified (e.g. by merging the CSV files provided) to set up a class competition in the Machine Learning module at Strathclyde University (academic year 2019-20), leading to two problems: a regression problem and a classification problem.
The final purpose was to identify the most important attributes that the most popular songs have in common.
In the regression problem, various models were built in order to predict the popularity of songs, whereas the classification problem led to the construction of models that predicted the genre of the respective songs.
This is an Exploratory Data Analysis of the "Google Play Store Apps" dataset, collected from this Kaggle repository.
It contains descriptive statistics, data analysis, and data visualization of the Google Play Store apps from the above data repository. The analysis was undertaken in the context of the "Big Data" module at Strathclyde University (Glasgow, UK) in the academic year 2019-20.
The results and the diagnostic analysis of this work, using both supervised (classification) and unsupervised (clustering) methods, led to insights that can be useful for identifying which attributes are linked with the most popular apps in the Google Play Store.
This is a research study on Search Engine Optimization (SEO) in the context of Information Retrieval.
It begins with understanding the basic components of Recommender Systems, such as information behaviour and indexing, description vs. discrimination on document results, and Recommender Systems frameworks and evaluation metrics such as PageRank, Precision, and Recall.
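As a small illustration of two of the evaluation metrics named above, the sketch below computes Precision and Recall for a single query from a list of retrieved documents and a list of relevant ones. The function name `precision_recall` and the document IDs are hypothetical and chosen only for this example; they do not come from the handbook itself.

```python
from typing import Iterable, Tuple


def precision_recall(retrieved: Iterable[str],
                     relevant: Iterable[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one query.

    precision = relevant retrieved / all retrieved
    recall    = relevant retrieved / all relevant
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical result list and relevance judgements for one query.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d5"]

precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Here two of the four retrieved documents are relevant, and two of the three relevant documents were retrieved, which shows how a system can trade one metric against the other.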
Subsequently, an in-depth analysis of how a website can be search engine optimized is undertaken with a focus on organic search results. More specifically, by understanding how search ranking results work, I demonstrated how to conduct a keyword research plan, the ways that content can be optimized, and lastly, how to perform a successful SEO evaluation.
Finally, I wrote an article on this topic, which was published in various developer communities as well as on LinkedIn Pulse; you can read it here.
I created this personal portfolio web page and keep it open-source on GitHub.