Merge remote-tracking branch 'vinta/master'

2025-02-18 22:08:36 +00:00 · 2018-04-24 12:15:13 +05:00 · 2018-04-24 12:15:13 +05:00 · d5d948d302
commit d5d948d302
parent d56c87f6b7 692e9bfe8e
4 changed files with 13 additions and 11 deletions
--- a/13
+++ b/13
@@ -1,15 +1,12 @@
-BASEDIR=$(CURDIR)
-DOCDIR=$(BASEDIR)/docs
-
-install:
+site_install:
 	pip install mkdocs==0.16.3
 	pip install mkdocs-material==1.12.2

-link:
-	ln -sf $(BASEDIR)/README.md $(DOCDIR)/index.md
+site_link:
+	ln -sf $(CURDIR)/README.md $(CURDIR)/docs/index.md

-preview: link
+site_preview: site_link
 	mkdocs serve

-deploy: link
+site_deploy: site_link
 	mkdocs gh-deploy --clean
--- a/README.md
+++ b/README.md
@@ -85,7 +85,7 @@ Inspired by [awesome-php](https://github.com/ziadoz/awesome-php).
    - [URL Manipulation](#url-manipulation)
    - [Video](#video)
    - [Web Content Extracting](#web-content-extracting)
-    - [Web Crawling](#web-crawling)
+    - [Web Crawling & Web Scraping](#web-crawling--web-scraping)
    - [Web Frameworks](#web-frameworks)
    - [WebSocket](#websocket)
    - [WSGI Servers](#wsgi-servers)
@@ -342,6 +342,7 @@ Inspired by [awesome-php](https://github.com/ziadoz/awesome-php).
 * [Open Mining](https://github.com/mining/mining) - Business Intelligence (BI) in Pandas interface.
 * [Orange](https://orange.biolab.si/) - Data mining, data visualization, analysis and machine learning through visual programming or scripts.
 * [Pandas](http://pandas.pydata.org/) - A library providing high-performance, easy-to-use data structures and data analysis tools.
+* [Optimus](https://github.com/ironmussa/Optimus) - Cleansing, pre-processing, feature engineering, exploratory data analysis and easy Machine Learning with a PySpark backend. 

 ## Data Validation

@@ -729,6 +730,7 @@ Inspired by [awesome-php](https://github.com/ziadoz/awesome-php).

 * [bpython](https://github.com/bpython/bpython) - A fancy interface to the Python interpreter.
 * [Jupyter Notebook (IPython)](https://jupyter.org) - A rich toolkit to help you make the most out of using Python interactively.
+    * [awesome-jupyter](https://github.com/markusschanta/awesome-jupyter)
 * [ptpython](https://github.com/jonathanslenders/ptpython) - Advanced Python REPL built on top of the [python-prompt-toolkit](https://github.com/jonathanslenders/python-prompt-toolkit).

 ## Internationalization
@@ -815,6 +817,7 @@ Inspired by [awesome-php](https://github.com/ziadoz/awesome-php).
 * [SnowNLP](https://github.com/isnowfy/snownlp) - A library for processing Chinese text.
 * [spaCy](https://spacy.io/) - A library for industrial-strength natural language processing in Python and Cython.
 * [TextBlob](https://github.com/sloria/TextBlob) - Providing a consistent API for diving into common NLP tasks.
+* [PyTorch-NLP](https://github.com/PetrochukM/PyTorch-NLP) - A toolkit enabling rapid deep learning NLP prototyping for research.

 ## Network Virtualization

@@ -1200,9 +1203,9 @@ Inspired by [awesome-php](https://github.com/ziadoz/awesome-php).
 * [textract](https://github.com/deanmalmgren/textract) - Extract text from any document, Word, PowerPoint, PDFs, etc.
 * [toapi](https://github.com/gaojiuli/toapi) - Every web site provides APIs.

-## Web Crawling
+## Web Crawling & Web Scraping

-*Libraries for scraping websites.*
+*Libraries to automate data extraction from websites.*

 * [cola](https://github.com/chineking/cola) - A distributed crawling framework.
 * [Demiurge](https://github.com/matiasb/demiurge) - PyQuery-based scraping micro-framework.
--- a/docs/css/extra.css
+++ b/docs/css/extra.css
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -21,5 +21,7 @@ extra:
 google_analytics:
  - 'UA-510626-7'
  - 'auto'
+extra_css:
+    - css/extra.css
 pages:
  - "Life is short, you need Python.": "index.md"