Resources for Machine Learning

Interested in learning more about machine learning? The resources below are a good place to start. Frameworks let you run common machine learning algorithms without having to write them from scratch. The databases are useful for getting the large-scale data that machine learning requires.


Top Frameworks


TensorFlow

TensorFlow is a free and open-source software library for dataflow and differentiable programming across a range of tasks. It is a symbolic math library, and is also used for machine learning applications such as neural networks.
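To make "differentiable programming" concrete, here is a minimal sketch, assuming TensorFlow 2.x with eager execution. The function y = x * x and the variable names are illustrative choices, not part of any particular tutorial.

```python
import tensorflow as tf

# A trainable scalar; 3.0 is an arbitrary illustrative value.
x = tf.Variable(3.0)

# GradientTape records the operations applied to x so that
# TensorFlow can differentiate through them afterwards.
with tf.GradientTape() as tape:
    y = x * x

# dy/dx for y = x^2 is 2x, evaluated here at x = 3.
dy_dx = float(tape.gradient(y, x))
print(dy_dx)
```

The same mechanism is what lets the library compute gradients for the weights of a neural network during training.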

ML.NET

ML.NET is a free software machine learning library for the C# and F# programming languages. It also supports Python models when used together with NimbusML.

Scikit-learn

Scikit-learn is a free software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms.
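As a rough illustration of the classification side, the sketch below trains a classifier on the small Iris dataset that ships with scikit-learn. The choice of logistic regression, the 25% test split, and the random seed are assumptions made for the example, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the bundled Iris dataset as feature matrix X and label vector y.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples for evaluation (seed chosen arbitrarily).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit a logistic regression classifier; max_iter is raised so the solver converges.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Mean accuracy on the held-out samples.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Other estimators in the library follow the same fit and score pattern, so swapping in a different algorithm usually means changing only the line that constructs the model.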

Weka

Weka (Waikato Environment for Knowledge Analysis) is machine learning software developed at the University of Waikato, New Zealand. It is free software licensed under the GNU General Public License.

Amazon Machine Learning

Amazon Machine Learning is a service that provides tools for creating machine learning applications. It applies machine learning algorithms to find patterns in data and uses those patterns to make predictions.

Torch

Torch is an open-source machine learning library, scientific computing framework, and scripting language based on the Lua programming language. It provides a wide range of algorithms for deep learning and is built on the LuaJIT scripting language with an underlying C implementation.


Top Databases


FiveThirtyEight

FiveThirtyEight is a site that publishes data on political opinion and culture, as well as statistical data covering sports, science, economics, and elections.

Kaggle

Kaggle, a subsidiary of Google LLC, is an online community of data scientists and machine learning practitioners.

Data.gov

Data.gov is a U.S. government website launched in late May 2009 by Vivek Kundra, then the Federal Chief Information Officer of the United States. Data.gov aims to improve public access to high-value, machine-readable datasets generated by the Executive Branch of the Federal Government.

GroupLens

GroupLens Research has collected and made available several datasets. Before using these datasets, review their README files for the usage licenses and other details.

Climate Data Online

Climate Data Online (CDO) provides free access to NCDC's archive of global historical weather and climate data in addition to station history information.

Google Cloud Public Datasets

A public dataset is any dataset that is stored in BigQuery and made available to the general public through the Google Cloud Public Dataset Program. BigQuery hosts these datasets so that you can access them and integrate them into your applications.