Just heard a very interesting report from Xinghua Lou, a machine learning researcher at Microsoft Research. Xinghua used GraphLab topic modeling to cluster health-related documents. This work was presented at the Big Data Innovation Summit 2014.
From the KDnuggets blog post about this work:
"Among various techniques for understanding text corpus, we chose LDA topic models (implemented in GraphLab) because of its previous success in understanding scientific literature as well as webpages. We followed a process roughly as follows: data cleaning and standardization, topic modeling, clinical note clustering and visualization, community finding and cancer-gene correlation analysis. This process was mainly implemented by Katherine Chan under my supervision. We had a few interesting findings, such as a community of patients who highly care about the risk of the treatment, the ability of predicting icd-9 code from topic modeling output, and some interesting correlations between patient profile and genetic mutation tests (some supported by previous published research)."
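To give a feel for the topic modeling and clustering steps mentioned in the quote, here is a minimal sketch of LDA fitted with collapsed Gibbs sampling in plain Python. It is not GraphLab's implementation and not the code used in this work: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the tiny toy documents are all assumptions made up for illustration. Each document is then assigned to a cluster by taking its most probable topic, which is one simple way to do it and not necessarily the approach described in the talk.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA on tokenized documents with collapsed Gibbs sampling (illustrative only)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the current token from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed per-document topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word, vocab


# Toy, already-cleaned "documents" (invented for this sketch, not real clinical notes).
docs = [
    "treatment risk side effects risk surgery".split(),
    "surgery risk treatment recovery".split(),
    "gene mutation test result mutation".split(),
    "genetic test gene mutation profile".split(),
]

theta, topic_word, vocab = lda_gibbs(docs, num_topics=2)

# Cluster each document by its most probable topic.
clusters = [max(range(len(row)), key=row.__getitem__) for row in theta]
print(clusters)
```

A real pipeline would of course run on a much larger corpus after the data cleaning and standardization step, and would use a scalable implementation rather than this loop.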