Blog

Blog Categories

What is Exploratory Data Analysis Exploratory data analysis (EDA) is a powerful tool for a comprehensive study of the available information providing answers to basic data analysis questions. What distinguishes it from traditional analysis based on testing a priori hypothesis is that EDA makes it possible to detect — by using various methods — all potential systematic correlations in the...
In the modern world, the information flow which befalls on a person is daunting. This led to a rather abrupt change in the basic principles of data perception. Therefore visualization is becoming the main tool for presenting information. With the help of visualization, information is presented to the audience in a more accessible, clear, visual form. Properly chosen method of visualization can make it possible to structure large data arrays,...
The more carefully you process the data and go into details, the more valuable information you can get for your benefit. Data visualization is an efficient and handy tool for gaining insights from data. Moreover, you can make the data far more understandable, colorful and pleasant with the help of visualization tools. As data is changing every second, it is an urgent task to investigate it carefully and get the insights as fast as...
Companies have a growing demand to visualize their data with business intelligence tools. We compared the salaries from across 10 different European countries using   Glassdoor , which offers self-reported salary information by location and employer, giving us some key insights into the salaries of people with “Business Intelligence” in their job title. Switzerland with the highest salary for Business Intelligence There...
In financial markets, tradable instruments and securities have unique identifiers. The identifiers are very useful, because you can make sure that you and your counterparty are talking about the same instrument while trading. The difficulty is that there isn't really a standard for all the various sorts of instruments or markets. Anyone working in the industry will recognize this issue, especially people working at larger institutions who...
The random forest algorithm is the combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. It can be applied to different machine learning tasks, in particular, classification and regression. Random Forest uses an ensemble of decision trees as a basis and therefore has all advantages of decision trees, such as high accuracy,...
   .caret, .dropup > .btn > .caret { border-top-color: #000 !important; } .label { border: 1px solid #000; } .table { border-collapse: collapse !important; } .table td, .table th { background-color: #fff !important; } .table-bordered th, .table-bordered td { border: 1px solid #ddd !important; } } @font-face { font-family: 'Glyphicons Halflings'; src: url('../components/bootstrap/fonts/glyphicons-halflings-regular.eot');...
Since the dawn of the digital age, the amount of data stored on servers has risen dramatically. More and more firms are looking for talent that can handle their datasets and generate insights for business decisions. Data scientists are among the most popular for this task. Google Trends shows that the global volume of the search term “Data Scientist” has tripled over the last 5 years - but how does the increasing demand translate...
In this Jupyter Notebook we will retrieve data from the European Central Bank (ECB). The ECB publishes through the European Open Data Portal, which we discussed in the previous tutorial . Before diving into the code, please take a quick look at the following websites, to get a feel for what we will be dealing with. EU portal: https://data.europa.eu/euodp/en/data/publisher/ecb ECB SDMX 2.1 RESTful web service:...
  The EU Open Data Portal gives access to open data published by EU institutions, agencies and other bodies. Around 70 EU institutions, bodies or departments use the platform to make over 12,500 datasets available. In this Jupyter Notebook we will retrieve data from open data portal " http://data.europa.eu/euodp/en/home ". The portal is based on the open source project CKAN. CKAN stands for Comprehensive...
Since the dawn of the digital age, the amount of data stored on servers has risen dramatically. With this increase, more and more firms are looking for talent that can handle their datasets and generate insights for business decisions. Google Trends shows that the global volume of the search term “Data Analyst” nearly tripled over the last 5 years. How does the increasing demand translate into earnings of data analysts in...
  Google became the main starting point for our online activities. The search engine processes about 40,000 searches every second or 3.5 billion searches per day. It records what people are interested in, what they worry about or where they want to travel. In a unique manner, the search engine captures trends in interests and behavior. Hidden racisms, sexual orientation or ad returns - check out the work by Seth...