Blog

Among open source relational databases, PostgreSQL is probably one of the most popular thanks to its rich feature set. That is why it is used in nearly every area of work that involves databases. In this article, we will walk through connecting to and using PostgreSQL from R. R is an open source language for statistical and graphical data analysis that provides scientists, statisticians, and academics with powerful tools...
Accessing and analyzing media content is a fascinating part of data analytics. It lets you follow trends of public interest over time or see how stories evolve (e.g. newslens). While many media outlets offer APIs, it is cumbersome to query each one individually. News API closes that gap and lets you search and retrieve live articles from all over the web...
Unless you've been living under a rock, you are probably familiar with data science making waves across industries. Maybe you saw it while scrolling aimlessly through Facebook, or you've been actively looking for a job in the field. Either way, data science is an emerging field, one of the most sought-after jobs of this century, and everyone knows about it. But with jobs comes a hiring process. With hiring comes...
Because of the rapid development of the field and its promising future opportunities, many people are interested in entering the world of data science right now. If you have no experience in the field or are transitioning from a different background, finding your first data science job can be a daunting task, especially as competition is increasing. This blog gives you six tips to increase your chances of landing...
Overview: Lately, I have worked a lot with the Azure Cloud. Overall, I have to say Azure offers a lot but is still not on the same level as its strongest competitors (AWS, Google). One thing that caught my eye is how well certain programming languages are supported. Azure supports a few different languages (C#, JavaScript, Java, Python, etc.), but the supported features for these languages differ a lot. I think Azure Cloud is really great for...
Support Vector Machine (SVM) is a widely used supervised learning algorithm for classification and regression tasks, though it is mostly used for classification problems. The points of different classes are separated by a hyperplane, and this hyperplane is chosen so that its distance to the nearest data points on each side is maximal. SVM has several advantages. The first one is that SVM...
The first step to landing a job as a data scientist is the same as in any other profession: create a compelling CV! Although there are more open positions in the field of data science than ever before, it is important to have a strong and suitable CV to land the job you want. This post gives you nine tips to improve your chances of finding the data science job you want. 1. Include achievements, not just jobs...
Use your extra time at home (and your data skills) for a good cause: Check out the Kaggle COVID-19 Open Research Dataset Challenge. In response to the COVID-19 pandemic, the White House and a coalition of leading research groups have prepared the COVID-19 Open Research Dataset (CORD-19). This dataset is a resource of over 29,000 scholarly articles, including over 13,000 with full text, about COVID-19, SARS-CoV-2, and...
Judging by media coverage and general public interest in the topic, data science is still on the rise. Google Trends is an excellent tool to analyse the fever curve of public interest. The screenshot below shows the Google search traffic in the UK for Data Science and Machine Learning over the last 5 years. After years of strongly rising interest, the curve has flattened, but it is still increasing. We analyze in...
For many machine learning problems with a large number of features or a low number of observations, a linear model tends to overfit and variable selection is tricky. Models that use shrinkage, such as Lasso and Ridge, can improve the prediction accuracy as they reduce the estimation variance while providing an interpretable final model. In this tutorial, we will examine Ridge and Lasso...
A common and very challenging problem in machine learning is overfitting, and it takes many different forms. It is one of the major concerns when training a model. Overfitting occurs when the model captures too much noise in the training data set, which leads to poor prediction accuracy when the model is applied to new data. One way to avoid overfitting is regularization. In this tutorial, we...
Today, data science specialists are among the most sought-after in the labor market. Able to find significant insights in huge amounts of information, they help companies and organizations optimize their operations. The field of data science is developing rapidly, and the demand for talent is changing. We use job offerings to analyze the current demand for talent. After performing a first analysis in 2017, this...