About the Book:
Comprehensive Coverage of the Entire Area of Classification
Research on the problem of
classification tends to be fragmented across such areas as pattern recognition,
database, data mining, and machine learning. Addressing the work of these
different communities in a unified way, Data Classification: Algorithms
and Applications explores the underlying algorithms of classification
as well as applications of classification in a variety of problem domains,
including text, multimedia, social network, and biological data.
This comprehensive book focuses on three primary aspects of data
classification:
· Methods: The book first describes common techniques
used for classification, including probabilistic methods, decision trees,
rule-based methods, instance-based methods, support vector machine methods, and
neural networks.
· Domains: The book then examines specific methods used for
data domains such as multimedia, text, time-series, network, discrete sequence,
and uncertain data. It also covers large data sets and data streams due to the
recent importance of the big data paradigm.
· Variations: The book concludes with insight on variations
of the classification process. It discusses ensembles, rare-class learning,
distance function learning, active learning, visual learning, transfer
learning, and semi-supervised learning as well as evaluation aspects of
classifiers.
Features
· Integrates different perspectives from the pattern recognition, database, data mining, and machine learning communities
· Presents an overview of the core methods in data classification
· Covers recent problem domains, such as graphs and social networks
· Discusses advanced methods for enhancing the quality of the underlying classification results
Contents:
1. An Introduction to Data Classification
2. Feature Selection for Classification: A Review
3. Probabilistic Models for Classification
4. Decision Trees: Theory and Algorithms
5. Rule-Based Classification
6. Instance-Based Learning: A Survey
7. Support Vector Machines
8. Neural Networks: A Review
9. A Survey of Stream Classification Algorithms
10. Big Data Classification
11. Text Classification
12. Multimedia Classification
13. Time Series Data Classification
14. Discrete Sequence Classification
15. Collective Classification of Network Data
16. Uncertain Data Classification
17. Rare Class Learning
18. Distance Metric Learning for Data Classification
19. Ensemble Learning
20. Semi-Supervised Learning
21. Transfer Learning
22. Active Learning: A Survey
23. Visual Classification
24. Evaluation of Classification Methods
25. Educational and Software Resources for Data Classification
About the Editor:
Charu C. Aggarwal is a research scientist at the IBM T.J.
Watson Research Center in Yorktown Heights, New York. He completed his B.S.
from IIT Kanpur in 1993 and his Ph.D. from the Massachusetts Institute of
Technology in 1996. His research interest during his Ph.D. years was in
combinatorial optimization (network flow algorithms), and his thesis advisor
was Professor James B. Orlin. He has since worked in the field of performance
analysis, databases, and data mining. He has published over 200 papers in
refereed conferences and journals, and has applied for or been granted over 80
patents. He is author or editor of ten books. Because of the commercial value of
the aforementioned patents, he has received several invention achievement
awards and has thrice been designated a Master Inventor at IBM. He is a
recipient of an IBM Corporate Award (2003) for his work on bio-terrorist threat
detection in data streams, a recipient of the IBM Outstanding Innovation Award
(2008) for his scientific contributions to privacy technology, a recipient of
the IBM Outstanding Technical Achievement Award (2009) for his work on data
streams, and a recipient of an IBM Research Division Award (2008) for his
contributions to System S. He also received the EDBT 2014 Test of Time Award
for his work on condensation-based privacy-preserving data mining.
He served as an associate editor of the IEEE Transactions on Knowledge and Data
Engineering from 2004 to 2008. He is an associate editor of the ACM
Transactions on Knowledge Discovery and Data Mining, an action editor of the
Data Mining and Knowledge Discovery Journal, editor-in-chief of the ACM SIGKDD
Explorations, and an associate editor of the Knowledge and Information Systems
Journal. He serves on the advisory board of the Lecture Notes on Social
Networks, a publication by Springer. He serves as the vice-president of the
SIAM Activity Group on Data Mining, which is responsible for all data mining
activities organized by SIAM, including their main data mining conference. He
is a fellow of the IEEE and the ACM, for “contributions to knowledge discovery
and data mining algorithms.”