Sparse Machine Learning Methods for Understanding Large Text Corpora

Shared by Ashok Srivastava, updated on Jan 27, 2012

Summary

Author(s) :: Laurent El Ghaoui, Guan-Cheng Li, Viet-An Duong, Ashok Srivastava, Kanishka Bhaduri

Abstract

Sparse machine learning has recently emerged as a powerful tool to obtain models of high-dimensional data with a high degree of interpretability, at low computational cost. This paper posits that these methods can be extremely useful for understanding large collections of text documents, without requiring user expertise in machine learning. Our approach relies on three main ingredients: (a) multi-document text summarization and (b) comparative summarization of two corpora, both using sparse regression or classification; (c) sparse principal components and sparse graphical models for unsupervised analysis and visualization of large text corpora. We validate our approach using a corpus of Aviation Safety Reporting System (ASRS) reports and demonstrate that the methods can reveal causal and contributing factors in runway incursions. Furthermore, we show that the methods automatically discover four main tasks that pilots perform during flight, which can aid in further understanding the causal and contributing factors to runway incursions and other drivers for aviation safety incidents.
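
To make ingredient (b) concrete, the following is a minimal sketch of comparative summarization through sparse regression. It is not the authors' implementation: the six short "reports" are invented toy sentences (not ASRS data), the helper names (`bag_of_words`, `lasso`, `comparative_summary`) and the penalty value are assumptions chosen for illustration, and a plain coordinate-descent lasso stands in for whichever sparse solver the paper uses. Documents from one corpus are labeled +1 and the other -1; the words whose weights survive the L1 penalty form the summary.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(docs):
    """Return (vocabulary, document-term count matrix as nested lists)."""
    vocab = sorted({w for d in docs for w in tokenize(d)})
    rows = []
    for d in docs:
        counts = Counter(tokenize(d))
        rows.append([float(counts[w]) for w in vocab])
    return vocab, rows


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)*||y - Xw - b||^2 + lam*||w||_1."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = sum(y) / n
    resid = [y[i] - b for i in range(n)]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes word j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_w
        # Re-fit the unpenalized intercept so the residuals average to zero.
        shift = sum(resid) / n
        b += shift
        resid = [r - shift for r in resid]
    return w, b


def comparative_summary(corpus_a, corpus_b, lam):
    """Words that best separate corpus_a (+1) from corpus_b (-1)."""
    vocab, X = bag_of_words(corpus_a + corpus_b)
    y = [1.0] * len(corpus_a) + [-1.0] * len(corpus_b)
    w, _ = lasso(X, y, lam)
    kept = [(word, wj) for word, wj in zip(vocab, w) if wj != 0.0]
    return sorted(kept, key=lambda pair: -abs(pair[1]))


# Invented toy corpora, for illustration only.
incursion_reports = [
    "aircraft crossed runway without clearance",
    "tower cleared aircraft then runway incursion occurred",
    "pilot crossed hold line onto runway",
]
other_reports = [
    "aircraft encountered turbulence during cruise",
    "pilot reported turbulence and altitude deviation",
    "engine warning during cruise altitude",
]

summary = comparative_summary(incursion_reports, other_reports, lam=0.4)
```

With this fairly strong penalty, the only word that keeps a nonzero weight on the toy input is `runway`, the one term present in every report of the first corpus and absent from the second. Lowering `lam` admits more words into the summary, which is the interpretability versus detail trade-off that the sparsity penalty controls.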

Citation: L. El Ghaoui, G. C. Li, V. Duong, V. Pham, A. N. Srivastava, and K. Bhaduri, “Sparse Machine Learning Methods for Understanding Large Text Corpora,” Proceedings of the Conference on Intelligent Data Understanding, 2011.

Files

cidu2011-dashlink.pdf

5.3 MB

