Thursday, January 10, 2019

Alibaba acquires Data Artisans?

Data Artisans is the company behind Apache Flink - the European answer to Apache Spark.
According to this news article, Alibaba has acquired Data Artisans.

I wrote about the Apache Flink project back in 2014.

Friday, December 8, 2017

Apple shares Turi Create open source framework

It is very exciting that after many years of hard work, we have finally released our machine learning framework as open source! The announcement was made yesterday at NIPS by Prof. Carlos Guestrin:

And here is our GitHub link:

Friday, September 8, 2017

Prof. Joseph Keshet from BIU fools deep learning

My friend Joseph (Yossi) Keshet has recently released work on fooling deep learning systems. His work got a lot of attention, including from MIT Technology Review and the New Scientist. Nice work!!

Dataiku raised $28M

According to VentureBeat, Dataiku just raised $28M. Dataiku has a web-based platform for data science.

Here is my personal connection. Strangely, the last time I wed a couple, I was wearing their t-shirt.

Unrelated, I just learned from my colleague Brian that Cloudera just acquired Fast Forward Labs, the company founded by Hilary Mason. I visited Hilary in her offices a couple of years ago and learned they had an interesting consulting model: sharing periodic tech reports to help data scientists become more proficient. Congrats Hilary!

Monday, September 4, 2017

Deepgram - Audio Search with Deep Learning

A very interesting podcast by Sam Charrington, who interviews Scott Stephenson from DeepGram. DeepGram uses deep learning activations to create indexes that allow searching for text in voice recordings.

DeepGram has released Kur, a high-level abstraction over deep learning frameworks that allows quickly defining network layouts. Still, the target persona is researchers with deep learning knowledge.

A related Israeli startup is AudioBurst. They claim to use AI for indexing, but it is not clear what they actually do. Another Israeli startup is Verbit. They seem to transcribe audio with humans going over the preliminary result.

In addition, my friend Yishay Carmiel is working on importing parts of Kaldi to TensorFlow. A recent Google developer blog post describes this effort. Yishay is leading a spinoff of Spoken called IntelligentWire, which also searches audio files using deep learning.

Overall it seems that search in audio files using deep learning is getting hotter!

Wednesday, July 26, 2017

Some misc news

I just learned that my postdoc roommate Yisong Yue from Caltech released an interesting new paper: Factorized Variational Autoencoders for Modeling Audience Reactions to Movies, a joint work with Disney Research, published at CVPR 2017.

Another interesting paper, Accelerating Innovation Through Analogy Mining, just received the best paper award at KDD 2017. The paper is by Dafna Shahaf, who studied with me at CMU, and her student Tom Hope.

An earlier related work of Dafna's is a paper on identifying humor in cartoon captions.

Misha Bilenko, formerly from M$, released an open source library for gradient boosting. It seems to compete with XGBoost, with the claim that it supports categorical variables as well. (In GraphLab Create we had an extended XGBoost with categorical variable support.)

Sunday, February 12, 2017

Data Science Summit Europe 2017

The 3rd Data Science Summit Europe is coming! This year I am not involved in the organization, so it should probably be a better event :-) Save the date: May 29, 2017 in Jerusalem. The date was picked to fall just after O'Reilly Strata London 2017, so the conference will attract many speakers and guests from abroad.

The keynote speaker is my friend Dr. Ben Lorica, chief scientist of O'Reilly Media and the content organizer for O'Reilly Strata and O'Reilly AI conferences.

Hope to see you there!