HSE Big Data Systems Blog


Big Data news, events and must-reads, as well as HSE programme highlights and insights. Hello! We are starting our Big Data Systems blog!

Here we will touch upon technological aspects as well as business cases and solutions. Key recent and upcoming events in the Programme at HSE will also be covered. We really hope this will be interesting both for beginners and for professionals who dive deep. We also run this blog on VK in Russian. Those who prefer VKontakte to Facebook are welcome at vk

13/07/2016

We have noticed a funny phenomenon: the less we write, the more you subscribe :) Could it work the other way round too?!

Books are always welcome, so today we present 16 free books on machine learning, from the very first baby steps in the field to highly specific topics that have been familiar to us for so long yet still seem mysterious, such as deep learning or neural networks: hackerlists.com/free-machine-learning-books/

For example, what would you say to 100 pages of "introduction" to the reinforcement learning problem (an area of machine learning in which an agent learns from interacting with its environment, typically formulated as a Markov decision process)? Well, how cool is that!!!
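To give a feel for what "formulated as a Markov decision process" means, here is a minimal sketch of tabular Q-learning on a toy problem. It is not taken from any of the books: the four-state corridor, the reward of 1 for reaching the last state, and the values of GAMMA and ALPHA are all assumptions made purely for illustration.

```python
import random

N_STATES = 4          # hypothetical corridor: states 0..3, state 3 is the goal
ACTIONS = (0, 1)      # 0 = step left, 1 = step right
GAMMA, ALPHA = 0.9, 0.5   # discount factor and learning rate (assumed values)

def step(state, action):
    """Deterministic toy environment: returns (next_state, reward, done)."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def q_learning(episodes=500, seed=0):
    """Learn action values from random interaction with the environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)          # purely random exploration
            nxt, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best value of the next state
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
    return q

q_table = q_learning()
# Greedy policy: the best-valued action in every non-terminal state
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

After training, the greedy policy should pick "step right" in every state, because that is the shortest way to the reward.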

So read and dare, while we come back from our impromptu holiday and continue to seek out and publish interesting articles for you! 😌

29/06/2016

What are you doing now? If you have some spare time or a coffee break, check out the 'Yet another Conference on Marketing 2016' online stream: events.yandex.ru/events/yac/29-june-2016/ (there are 2 channels)

Today on air:
- Content marketing in social networks (Leadscanner)
- Hyper-local advertising: exactly in the right place (Yandex+Nielsen)
- The digital revolution in the Mobile Programmatic market (Beeline)
- How to derive maximum benefit from the data you have (Marilyn)
- Building smart advertising campaigns based on user data (ISEE Marketing)... and many others!

Check the full program today and join the broadcast!

ATTN! Working language is Russian.

Yet another Conference on Marketing 2016, 29 June 2016, Moscow. Yandex Events: all the information about Yandex conferences, schools, seminars and other events, both past and upcoming.

How to read the genome and build a human being | Riccardo Sabatini 27/06/2016

Are you new to bioinformatics? Perhaps you never planned to get acquainted with the biology-related sciences, but if you study data science it is well worth doing! Why? Well, this application field is quite specific, yet very promising and diverse for anyone not afraid of math and programming, and on top of that it is growing rapidly!

Today's talk by scientist and entrepreneur Riccardo Sabatini shows how it becomes possible to predict things like height, eye color, age and even facial structure, all from a vial of blood, just by reading the human genome: youtube.com/watch?v=s6rJLXq1Re0.

In fact, tremendous opportunities are open to us. We already have the necessary tools, and now it is time to move from theory to practice: research is not limited to phenotype prediction alone, and machine learning will let us progress to other applications and advances, for example in personalized medicine...

Secrets, disease and beauty are all written in the human genome, the complete set of genetic instructions needed to build a human being. Now, as scientist an...

22/06/2016

Hey, guys, what are your plans for tomorrow night? Tomorrow, June 23, the regular Moscow Python Meetup will be held in the Moscow office of Yandex. The program includes three presentations. In addition to the stories about new technologies that are traditional for the meetup community, the speakers will talk about the concept of open data and show what tasks it offers to Pythonists. Since the event is very popular, registration closed almost immediately; however, there will be a webcast here: youtube.com/watch?v=dMafO5-HsTM

Program of the event:
7 P.M. (UTC+3:00) Telegram Bot API 2.0 - what's new? Speaker: Bogdan Evstratenko, Yandex
7:40 P.M. (UTC+3:00) Push, or how to discharge a strange phone. Speaker: Alina Baymasheva, Rambler & Co developer
8:20 P.M. (UTC+3:00) What can Pythonists do with open data? Speaker: Elena Nikitina, Analytical Centre under the Government of the Russian Federation

Get connected and do not miss the fun!

20/06/2016

Modern Trade Code is a conference on marketing, advertising and creativity, sales, technology, and other current topics in modern commerce. Leading experts will share their unique knowledge and experience, work through cases, and explain what is happening in one of the most important sectors of the Russian economy.

From our point of view, one of the most interesting presentations at this event is "Technology in the Modern Trade. A great discussion about the technological future of industry", which highlights the following points:
- How to analyze customer behavior and influence it: Big Data, iBeacon and other technologies;
- No longer science fiction: facial recognition, remote control of visitors' movement, augmented reality, neuromarketing and other technologies in the service of modern trade;
- In Digital we trust: how to increase sales with the introduction of digital technology and mobile solutions.

Major speakers: Yevgeny Drozdov (Commercial Director at OBI), Yuriy Berchenko (Google Russia), Anton Volodkin (Operational Marketing Director at M. Video), Dmitri Mitsuk (Head of Brand & Communications at Metro Cash & Carry), Vsevolod Kuzmich (IT Director at "Lenta").

To attend Modern Trade Code for free, fill in the registration form by 21 June inclusive: http://changellenge.com/modern-trade-code/

ATTN! The registration closes tomorrow. Working language is Russian.

Details on the venue: Moscow, 23rd of June, Holiday Inn Lesnaya

Hurry up and see you there! ;)

17/06/2016

Today we continue to supply you with the very basics of Python, so that you feel confident with it and can finally dive deep into it :)

Here, an overview of 10 interdisciplinary Python data visualization libraries is provided, from the well-known to the obscure: https://blog.modeanalytics.com/python-data-visualization-libraries/

Scroll through the Python Package Index and you’ll find libraries for practically every data visualization need—from GazeParser for eye movement research to pastalog for realtime visualizations of neural network training. And while many of these libraries are intensely focused on accomplishing a specific task, some can be used no matter what your field.

An overview like this helps you spot what is relevant to your own field of study, so that you know which package to look for when you are in the middle of data exploration.

16/06/2016

If Python is your appetizer but your relationship is strained, and opening GitHub projects makes you feel as lost as on a first date… then this tutorial is for you! Start from the very beginning, work through it from scratch, and you will never again get stuck at the initial stages. Sounds cool, doesn’t it?!

Python - where to click and what Pandas is (check the dataset inside!): analyticsvidhya.com/blog/2016/01/complete-tutorial-learn-data-science-python-scratch-2/

Because basic resources on Python for data science were lacking, the author, Kunal Jain, decided to create this tutorial to help others learn Python faster. The tutorial feeds you bite-sized pieces of information on how to use Python for data analysis, so chew on them until you are comfortable and then practice at the end.

Our tip: even if today is not the best day to try this tutorial, save (!) the link. You may well need it on the eve of your project deadline ;)

15/06/2016

Hurray!!! Researchers at MIT have demonstrated a deep learning algorithm that has effectively learned how to predict sound. When shown a silent video clip of an object being hit, the algorithm can produce a sound for the hit that is realistic enough to fool human viewers. Nevertheless, researchers say that there’s still room to improve the system. For example, if the drumstick moves especially erratically in a video, the algorithm is more likely to miss or hallucinate a false hit. It is also limited by the fact that it applies only to “visually indicated sounds” — sounds that are directly caused by the physical interaction that is being depicted in the video.

“When you run your finger across a wine glass, the sound it makes reflects how much liquid is in it,” says CSAIL PhD student Andrew Owens, who was the lead author on an upcoming paper describing the work. “An algorithm that simulates such sounds can reveal key information about objects’ shapes and material types, as well as the force and motion of their interactions with the world.”

Actually, the advance could lead to better sound effects for film and television and robots with improved understanding of objects in their surroundings.

Check the video and see (hear?) everything with your own eyes: youtu.be/0FW99AQmMc8
Link to original MIT article: news.mit.edu/2016/artificial-intelligence-produces-realistic-sounds-0613

14/06/2016

EURO 2016 is on 🔥!
Today we bring you some EURO predictions from several IT giants. They won't dazzle you with originality (but then, that's not what statistics is for, is it?), yet they are still really interesting.

Yahoo and Tumblr analyzed 20 billion sports blogs since the beginning of the year plus 4 years of statistics, and came up with a full prediction for all the matches: vk.cc/5h1SNc

Microsoft, using Bing, has predicted the results of the group stage (vk.cc/5h266I) and the play-off round (vk.cc/5h26fH). By the way, back in 2014 Microsoft managed to predict all the match outcomes without a single mistake…

Goldman Sachs has also prepared a prediction of the match outcomes, along with the probabilities of advancing. No spoilers, except that our Russian national team has only a 2.6% probability of winning the EURO: vk.cc/5h1Tdx

Another set of predictions comes from the Belgian company Bisnode, presented as nice infographics! vk.cc/5h1WQD

The German analytics agency Blue Yonder, founded by an ex-CERN scientist, has predicted the most probable champions: vk.cc/5h24dU. There are no surprises and no detailed description of the algorithm, but the data used covers the last 100 years!

09/06/2016

Today we turn to official matters: in May 2016 the White House published "Big Data: A Report on Algorithmic Systems, Opportunity, and Civil Rights".

It is notable that, in contrast to the first report (published in May 2014), which discussed public privacy issues in multiple dimensions, the second one shifts attention from data collection to the processing itself. Big Data is no longer treated as something emerging or merely anticipated. It is here.

The report itself covers four areas: credit, employment, higher education, and criminal justice, considering the specific problem, opportunity, and challenge in each.

But the crucial thing is still that Big Data has taken another important step on the path to the universal recognition of its efficiency. And the road "towards" the full success seems to be much shorter than the route "backwards". And that's great.

07/06/2016

And today... our favorite column from Bernard Marr on Forbes! His article is dedicated to blockchain technology, which, as you probably know, is surrounded by a lot of hype at the moment: forbes.com/sites/bernardmarr/2016/05/27/how-blockchain-technology-could-change-the-world/

The technology was first pushed into the headlines several years ago thanks to the virtual currency Bitcoin, and now even majors like IBM and Microsoft have announced services based on blockchain tech. Furthermore, app ecosystems are already evolving, aiming to give businesses the toolsets they need to get involved, and the Internet of Things also benefits from this blockchain boom. You can find examples of real projects in the article.

But HOW exactly does it work, and what is its future potential? Check out the post and let us know what you think about this topic. ;)
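For the impatient, here is a minimal sketch of the one idea at the heart of a blockchain: every block stores the hash of the previous block, so changing an old record breaks the chain. This is our own simplified illustration, not code from the article; the block fields and the sample payments are made up, and real systems add consensus, digital signatures and networking on top.

```python
import hashlib
import json

def block_hash(block):
    """SHA-256 over a canonical JSON encoding of the block."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def add_block(chain, data):
    """Append a block that points to the hash of the previous one."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "data": data, "prev_hash": prev})

def is_valid(chain):
    """Every block must reference the actual hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )

chain = []
# Hypothetical records, for illustration only
for payment in ("Alice pays Bob 5", "Bob pays Carol 2", "Carol pays Dan 1"):
    add_block(chain, payment)

valid_before = is_valid(chain)
chain[1]["data"] = "Bob pays Carol 200"   # tamper with an old record
valid_after = is_valid(chain)
print(valid_before, valid_after)
```

The chain is valid until the second block is edited. After that, the hash stored in the third block no longer matches, and the check fails.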

06/06/2016

Statistical Natural Language Processing in Python or How To Do Things With Words. And Counters. or Everything I Needed to Know About NLP I learned From Sesame Street: nbviewer.jupyter.org/url/norvig.com/ipython/How%20to%20Do%20Things%20with%20Words.ipynb

Sounds amusing, doesn’t it?! So let’s see how NLP is done in Python, using a text file as an example. This project covers practically everything: from Zipf's law and spelling correction to word segmentation and evaluation of the results. It is very useful for learning, for your course work, or simply as a guide for further study!
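As a warm-up before opening the notebook, here is a minimal sketch of the "Words. And Counters." idea: count the words in a text and turn the counts into probabilities. It is our own illustration, not code from the notebook; the tiny sample text and the helper names tokens and prob are assumptions.

```python
import re
from collections import Counter

# A tiny made-up text standing in for a real text file
TEXT = """the cat sat on the mat and the dog sat on the log
the cat saw the dog and the dog saw the cat"""

def tokens(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

counts = Counter(tokens(TEXT))
total = sum(counts.values())

def prob(word):
    """Unigram probability: the share of all tokens that are this word."""
    return counts[word] / total

# Zipf's law says frequency is roughly proportional to 1 / rank,
# so rank * frequency should stay roughly constant on a large corpus.
for rank, (word, freq) in enumerate(counts.most_common(5), start=1):
    print(rank, word, freq, rank * freq)
```

On such a small sample the Zipf pattern is barely visible; try it on a whole book to see the effect.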

Learn about new functions, check your skills and… do not forget cookies!


Address: Moscow