Bibliography

The following is a non-exhaustive list of the resources that have been reviewed along

Books

Bird, Steven; Klein, Ewan; Loper, Edward. Natural Language Processing with Python. O'Reilly, 2009.
Dasadia, Cyrus; Nayak, Amol. MongoDB Cookbook (Second Edition). Packt Publishing, 2016.
Garg, Nishant. Learning Apache Kafka (Second Edition). Packt Publishing, 2015.
Goasguen, Sébastien. Docker Cookbook. O'Reilly, 2016.
Goasguen, Sébastien. Docker in the Cloud: recipes for AWS, Azure, Google, and More. O'Reilly, 2016.
Grus, Joel. Data Science from Scratch. O'Reilly, 2015.
Haloi, Saurav. Apache ZooKeeper Essentials: a fast-paced guide to using Apache ZooKeeper to coordinate services in distributed systems. Packt Publishing, 2015.
Hardeniya, Nitin. NLTK Essentials: build cool NLP and machine learning applications using NLTK and other Python libraries. Packt Publishing, 2015.
Junqueira, Flavio; Reed, Benjamin. ZooKeeper. O'Reilly, 2014.
Karau, Holden. Learning Spark. O'Reilly, 2015.
Kleppmann, Martin. Making Sense of Stream Processing. O'Reilly, 2016.
Lawson, Richard. Web Scraping with Python. Packt Publishing, 2015.
Leskovec, Jure; Rajaraman, Anand; Ullman, Jeffrey D. Mining of massive datasets. Stanford University, 2014.
Lin, Jimmy; Dyer, Chris. Data-intensive text processing with MapReduce. University of Maryland, College Park, 2010.
López, Félix; Romero, Víctor. Mastering Python Regular Expressions. Packt Publishing, 2014.
Makice, Kevin. Twitter API: up and running. O'Reilly, 2009.
Manivannan, Arun. Scala Data Analysis Cookbook. Packt Publishing, 2015.
Matthias, Karl; Kane, Sean P. Docker: up and running. O'Reilly, 2015.
McKendrick, Russ. Extending Docker. Packt Publishing, 2016.
Narębski, Jakub. Mastering Git. Packt Publishing, 2016.
Narkhede, Neha; Shapira, Gwen; Palino, Todd. Kafka: the definitive guide. O'Reilly, 2016.
Nicolas, Patrick R. Scala for Machine Learning. Packt Publishing, 2014.
Parsian, Mahmoud. Data Algorithms. O'Reilly, 2015.
Perkins, Jacob. Python Text Processing with NLTK 2.0 Cookbook. Packt Publishing, 2010.
Russell, Matthew A. 21 Recipes for Mining Twitter. O'Reilly, 2011.
Settles, Burr. Active Learning Literature Survey. University of Wisconsin-Madison, 2010.
Yadav, Rishi. Spark Cookbook. Packt Publishing, 2015.

Online articles and resources

Scholarly articles

Andor, Daniel [et al.]. Globally Normalized Transition-Based Neural Networks. Google Inc., 2016.
Beygelzimer, Alina; Kale, Satyen; Luo, Haipeng. Optimal and Adaptive Algorithms for Online Boosting. arxiv.org, 2015.
Blei, David M. Probabilistic Topic Models: surveying a suite of algorithms that offer a solution to managing large document archives. Communications of the ACM, 2012.
Breiman, Leo. Bagging Predictors. Department of Statistics, University of California Berkeley, 1994.
Chen, Tianqi; Guestrin, Carlos. XGBoost: A Scalable Tree Boosting System. arxiv.org, 2016.
Gu, Xiaodong [et al.]. Deep API Learning. arxiv.org, 2016.
Liu, Bing [et al.]. Partially Supervised Classification of Text Documents. Singapore-MIT Alliance, 2002.
Mihalcea, Rada; Tarau, Paul. TextRank: Bringing Order Into Texts. Conference on Empirical Methods in Natural Language Processing, 2004.
Morstatter, Fred [et al.]. Is the Sample Good Enough? Comparing Data from Twitter’s Streaming API with Twitter’s Firehose. Proceedings of the Seventh International AAAI Conference on Weblogs and Social Media, 2013.
Niu, Gang. Theoretical Comparisons of Positive-Unlabeled Learning against Positive-Negative Learning. arxiv.org, 2016.
Qin, Xiangju [et al.]. Learning from data streams with only positive and unlabeled data. Journal of Intelligent Information Systems, 2013.
Sill, Joseph [et al.]. Feature-Weighted Linear Stacking. arxiv.org, 2009.
Tao, Ke [et al.]. Groundhog Day: Near-Duplicate Detection on Twitter. Proceedings of the 22nd international conference on World Wide Web, 2013.
Teh, Yee Whye [et al.]. Hierarchical Dirichlet Processes. University of California Berkeley, 2005.

M'interessa Learning by choosing

Books

Online articles and resources

Scholarly articles