
## Rough IT notes of no use for others

title: "Focused Learning compilation"
author: "Gopal Kumar"
date: "18 August 2015"
output: html_document

# Learning MDown, pandoc, knitr, XeLaTeX, R, PostgreSQL, QGIS, Python, Android apps, CyanogenMod
1. EMACS24, Markdown, pandoc, git, Hg, XeLaTeX, beamer, Impress.js
2. R, knitr, statistics, Machine learning, Bigdata, Hadoop, SQL/Hive
3. SQL, PostgreSQL, remote sensing, QGIS, RGIS
4. Python, Cpp, Java, JavaScript, Joomla, LAMPS[1]
5. Android Apps, CyanogenMod, GSM4g
6. Linux, networking, Kali Linux, TCP/IP, Linux From Scratch, Cisco router program, switch, dns bind8
7. IndustrialIP, Openbus Modbus, Raspberry Pi, Arduino, Automation, Artificial intelligence
8. Industrial Drives, Rotating Machines, sensors and actuators, linear Motors, IGBT, IGCT
9. Misc, HAMcode, make, scriptprogram
10. LibreOffice, Apache Server, ERP, webERP, webcollab
11. Nature, adventure, travel, sustainable development, law, history, geography, space, physics

## Good sites with my login http://vk.com/datascience
## Basic UNIX commands http://www.datasciencecentral.com/group/resources/forum/topics/data-science-cheat-sheet
## install cygwin and then
You don’t need to spend hours learning the UNIX (Cygwin) console; these commands cover most needs:
cd, pwd, ls, tail -100, head -150, cp, mv, mkdir, rmdir, wc, cat
grep: search for lines matching a pattern
sort, uniq: sort alphabetically or numerically (option), drop repeated lines
gzip: compress/uncompress files
wc: count lines, words and bytes
chmod: change file permissions
history: list previously entered commands
cron, crontab: schedule tasks (running an executable once a day)

Operators: > (redirect output), >> (append), | (the pipe), & (run in background or batch mode), * (wildcard) and ! (history expansion)

## Book read and has some data http://www.win-vector.com/blog/introduction-to-data-science/
## github ggmtechn63

## Some sites
- datacentral <http://www.datasciencecentral.com/profiles/blogs/20-data-science-r-python-excel-and-machine-learning-cheat-sheets>
- Large data sets for use <https://www.quandl.com/search>
- Historical data visualisation <http://101.datascience.community/>
- Massive data mining software site with more links <http://www.kdnuggets.com/>
Data Science
Data Science Cheat Sheet – Basic
Data Science Cheat Sheet – Advanced
Getting Started Apache Hadoop Reference Card
Working with HDFS from the command line – Hadoop Cheat sheet
R
R functions for Regression Analysis
R functions for Time series Analysis
R Cheat Sheet
Data Visualization with R
Data Analysis the data.table way
Data Visualisation with ggplot2 cheatsheet by R studio
Python
Python 2.7 Quick Reference Sheet
Python Cheat Sheet by DaveChild
Python Basics Reference sheet
NumPy / SciPy / Pandas Cheat Sheet

Machine Learning

Choosing the right estimator Machine Learning cheatsheet
Patterns for Predictive learning cheatsheet
Machine learning algorithm cheat sheet for Microsoft Azure
Machine Learning cheatsheet Github 1
Machine Learning cheatsheet Github 2
Machine Learning which algorithm performs best?
Cheat sheet 10 machine learning algorithms R commands

Things a Linux user must learn
Learn bash: Just read the complete man page of bash (man bash).
Learn vim: You might be using Emacs or Eclipse for your work all the time, but nothing can compete with vim.
Learn ssh: Learn the basics of passwordless authentication.
Learn basics of bash job management: Using &, Ctrl-C, fg, bg, Ctrl-Z, jobs, kill.
Learn basic commands for file management: ls and ls -l, less, head, tail and tail -f, ln and ln -s (hard links and soft links), chown, mount, chmod, df, du (du -sk *).
Learn basic commands for network management: dig, ifconfig.
Learn how to use grep, find and sed.
Learn how to use aptitude or yum (depends on the distro) to find and install packages.

For daily use
In bash, you may use Ctrl+R to search in command history.
In bash, you may use Ctrl+W to delete the last word, and Ctrl+U to delete the complete line.
Use the cd - command to go back to the previous working directory.
Learn how to use xargs.

    $ find . -name \*.py | xargs grep some_function
    $ cat hosts | xargs -I{} ssh root@{} hostname
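For comparison, the first xargs pipeline above (find every .py file, then grep each one) can be sketched in Python. This is a rough equivalent rather than a replacement: `grep_py_files` is a made-up helper name, and it does a plain substring match where grep would use a regular expression.

```python
from pathlib import Path


def grep_py_files(root, needle):
    """Return (path, line_number, line) for each line containing needle in *.py files under root."""
    matches = []
    for path in sorted(Path(root).rglob("*.py")):
        if not path.is_file():
            continue  # find -name would also list directories; skip them here
        # errors="replace" keeps the scan going on files that are not valid UTF-8
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                matches.append((str(path), number, line.strip()))
    return matches
```

Calling `grep_py_files(".", "some_function")` returns the matches as a list, so they can be filtered or counted further instead of being printed straight away.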

Use the pstree -p command to see the process tree. Learn the various signals, e.g. to suspend a process, use kill -STOP [pid]. Type man 7 signal in a terminal for the complete guide.
If you want to keep running a background process forever then you can use nohup or disown.
Use the netstat -lntp command to see which processes are listening.
You should check about lsof also.
In your bash script you can use subshells to group commands.

    # Do something in current dir
    (cd /some/other/dir; other-command)
    # Continue in original dir
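The same "step into another directory, then come back" idea can be sketched in Python with a context manager. `working_directory` is a made-up name for this note; Python 3.11 and later also ship `contextlib.chdir`, which does much the same.

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily change the current directory, restoring it afterwards."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs even if the body raises, much like leaving the subshell
        os.chdir(previous)
```

Inside `with working_directory("/some/other/dir"):` the code runs in the other directory, and the original directory is restored once the block ends.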

Trimming of strings: ${var%suffix} and ${var#prefix}. For example, if var=foo.pdf, then echo ${var%.pdf}.txt prints “foo.txt”.
The output of a command can be treated like a file via <(some command). For example, compare the local /etc/hosts with a remote one:

    diff /etc/hosts <(ssh somehost cat /etc/hosts)

Know about “here documents” in bash.
Learn how to redirect both standard output and standard error via: some-command >logfile 2>&1.
You should know about the ASCII table (with hex and decimal values). Type man ascii in a terminal.
While working remotely via ssh, you should use screen or dtach to save your session.
For web developers, curl and curl -I, wget etc. are useful.
To convert an HTML page to a text file: lynx -dump -stdin
If you must handle XML, xmlstarlet is good.
In ssh, learn how to port tunnel with -L or -D (and occasionally -R). Also learn how to access web sites from a remote server.
If you were typing a command but then changed your mind, press Alt+Shift+3. It will add # at the beginning and enter it as a comment.

Data processing

Learn about sort and uniq.
Learn about cut, paste, and join.
Learn how to get the union, intersection and difference of text files:

    cat a b | sort | uniq > c        # c is a union b
    cat a b | sort | uniq -d > c     # c is a intersect b
    cat a b b | sort | uniq -u > c   # c is set difference a - b

Summing all numbers in the second column of a text file; the code given below is probably 3X faster and 3X shorter than equivalent Python:

    awk '{ x += $2 } END { print x }'
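For comparison, here is a rough Python sketch of three ideas from the notes above: suffix trimming, set operations on the lines of two files, and summing a column. All function names are made up for illustration. It treats each file as a set of distinct lines, which is also what the uniq-based shell versions assume, and unlike awk it raises an error on a non-numeric field instead of counting it as zero.

```python
def replace_suffix(name, old, new):
    """Like ${var%old}new in bash: swap a trailing suffix (needs Python 3.9+)."""
    return name.removesuffix(old) + new


def read_lines(path):
    """Return the set of distinct lines in a text file."""
    with open(path, encoding="utf-8") as handle:
        return {line.rstrip("\n") for line in handle}


def set_operations(path_a, path_b):
    """Return sorted union, intersection and difference (a - b) of two files' lines."""
    a, b = read_lines(path_a), read_lines(path_b)
    return sorted(a | b), sorted(a & b), sorted(a - b)


def sum_column(lines, column=2):
    """Sum a 1-based whitespace-separated column, skipping lines that are too short."""
    total = 0.0
    for line in lines:
        fields = line.split()
        if len(fields) >= column:
            total += float(fields[column - 1])  # ValueError if the field is not numeric
    return total
```

It is longer than the one-liners, which rather supports the point made in the notes, but the results come back as Python lists and numbers that can be reused directly.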

Learn about strings and grep command.
To split files into different parts learn about split (to split by size) and csplit (to split by a pattern).

System debugging

To know the status of your disk, cpu or network use iostat, netstat, top (or the better htop), and (especially) dstat.
To know your system’s memory status use free and vmstat command.
Use mtr which is a network diagnostic tool.
To find out which process or socket is using bandwidth, try iftop or nethogs.
You may use ab tool which is helpful for quick checking of web server performance.
For more serious network debugging, use wireshark or tshark.
Learn how to use strace, and that you can strace a running process (with -p). This is helpful if your program is failing, hanging, or crashing, and you don’t know why.
Use the ldd command to check shared libraries.
Learn how to connect to a running process with gdb and get its stack traces.
Knowledge of /proc is very helpful. Examples: /proc/cpuinfo, /proc/xxx/smaps, /proc/xxx/exe, /proc/xxx/cwd, /proc/xxx/fd/.
To find out why something went wrong in the past, use the sar command. It collects, reports and saves system activity information.
## EMACS24

Open dir C-x d, new file C-x C-f, save C-x C-s, save as C-x C-w, quit C-x C-c, new window below C-x 2, side by side C-x 3, insert file C-x i
Scroll screen: C-v, M-v
Move: C-b, C-f, C-n, C-p; M-b, M-f
Select C-SPC, cut C-w, paste C-y
# Short summary of important commands
***This is a short summary of markdown commands***

------------
*******
<gopalkumar@email.com>
# Rmarkdown shortnotes

----- or ********
~~strikeout~~
**bold** and *italics*
- itemised list
1. enumerate
<email>
# H1
## H2
###### H6

Alternatively,
Alt-H1
======

Alt-H2
------

[I’m a reference-style link][Arbitrary case-insensitive reference text]

[I’m a relative reference to a repository file](../blob/master/LICENSE)

[You can use numbers for reference-style link definitions][1]

Or leave it empty and use the [link text itself]

[arbitrary case-insensitive reference text]: https://www.mozilla.org
[1]: http://slashdot.org
Here’s our logo (hover to see the title text):

Inline-style:
![alt text](https://github.com/adam-p/markdown-here/raw/master/src/common/images/icon48.png "Logo Title Text 1")

Reference-style:
![alt text][logo]

[logo]: https://github.com/adam-p/markdown-here/raw/master/src/common/images/icon48.png "Logo Title Text 2"

Markdown renderers can highlight many languages (and not-really-languages, like diffs and HTTP headers); see the highlight.js demo page.

Inline `code` has `back-ticks around` it.

Blocks of code are enclosed by three back-ticks; name the language after the opening back-ticks to get syntax highlighting:

```javascript
var s = "JavaScript syntax highlighting";
```

```python
s = "Python syntax highlighting"
print(s)
```

```
No language indicated, so no syntax highlighting.
But let's throw in a <b>tag</b>.
```
**Tables**
The outer pipes (|) are optional, and you don’t need to make the raw Markdown line up prettily. You can also use inline Markdown.

Markdown | Less | Pretty
--- | --- | ---
*Still* | renders | **nicely**
1 | 2 | 3

**Block quotes**
> Blockquotes are very handy in email to emulate reply text.
> This line is part of the same quote. cascade also

Quote break.

You can also use raw HTML in your Markdown, and it’ll mostly work pretty well.

Horizontal Rule

Three or more…

---

Hyphens

***

Asterisks

___

Underscores

Insert image links in pure Markdown, but losing the image sizing and border:

Lists

Unordered

You may use any of the following symbols to denote bullets for each list item:

* valid bullet
    - valid bullet nested inside
+ valid bullet

Inline code

Wrap inline snippets of code with back-ticks (`).

For example, `<section></section>` should be wrapped as “inline”.

Indented code

Or indent several lines of code by at least four spaces, as in:

    line 1 of code
    line 2 of code
    line 3 of code

Block code “fences”

Use “fences” (three back-ticks) to block in multiple lines of code.

``` html
Sample text here…
```

HTML:

<pre>
<p>Sample text here…</p>
</pre>

Syntax highlighting

GFM, or “GitHub Flavored Markdown”, also supports syntax highlighting. To activate it, simply add the file extension of the language you want to use directly after the first code “fence”, e.g. js, and syntax highlighting will automatically be applied in the rendered HTML. For example, to apply syntax highlighting to JavaScript code:

``` javascript
grunt.initConfig({
  assemble: {
    options: {
      assets: 'docs/assets',
      data: 'src/data/*.{json,yml}',
      helpers: 'src/custom-helpers.js',
      partials: ['src/partials/**/*.{hbs,md}']
    },
    pages: {
      options: {
        layout: 'default.hbs'
      },
      files: {
        './': ['src/templates/pages/index.hbs']
      }
    }
  }
});
```

Renders to this complicated HTML:
Tables

Create tables by adding pipes as dividers between columns, and a line of dashes (separated by bars) beneath the header.
Pipes need not be vertically aligned.

| Option | Description |
| ------ | ----------- |
| data | path to data files to supply the data that will be passed into templates. |
| engine | engine to be used for processing templates. Handlebars is the default. |
| ext | extension to be used for dest files. |

Align: a colon on one side of the dashes below any heading aligns the text for that column (on the right side, as here, it right-aligns).

| Option | Description |
| ------:| -----------:|
| data | path to data files to supply the data that will be passed into templates. |
| engine | engine to be used for processing templates. Handlebars is the default. |
| ext | extension to be used for dest files. |
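The pipe-table rules above (header row, a dash row with optional colons for alignment, then one row per record) can be sketched as a small Python generator. `markdown_table` and its `align` codes ('l', 'r', 'c') are made up for this note, and cells that themselves contain a pipe are not escaped.

```python
def markdown_table(headers, rows, align=None):
    """Build a pipe table; align holds 'l', 'r' or 'c' per column (default 'l')."""
    align = align or ["l"] * len(headers)
    rules = {"l": "---", "r": "---:", "c": ":---:"}

    def row(cells):
        return "| " + " | ".join(str(cell) for cell in cells) + " |"

    # Header, then the dash row that carries the alignment colons, then the data
    lines = [row(headers), row(rules[a] for a in align)]
    lines.extend(row(r) for r in rows)
    return "\n".join(lines)
```

For example, `markdown_table(["Option", "Description"], [["ext", "extension to be used for dest files."]], align=["r", "l"])` yields a three-line table whose first column is right-aligned.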

[Upstage](https://github.com/upstage/ "Visit Upstage!")

Renders to (hover over the link, there should be a tooltip):
Upstage

HTML: <a href="https://github.com/upstage/" title="Visit Upstage!">Upstage</a>

Named Anchors: enable you to jump to the specified anchor point on the same page. eg

* [Chapter 1](#chapter-1)
* [Chapter 2](#chapter-2)
* [Chapter 3](#chapter-3)

## Chapter 1 <a id="chapter-1"></a>
Content for chapter one.

## Chapter 2 <a id="chapter-2"></a>
Content for chapter two.

## Chapter 3 <a id="chapter-3"></a>
Content for chapter three.

NOTE that specific placement of the anchor tag seems to be arbitrary.
They are placed inline here since it seems to be unobtrusive, and it works.
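A table of contents and its matching anchored headings can also be generated rather than typed by hand. The sketch below assumes the convention used above (lowercase id, spaces turned into hyphens); `slugify` and `toc_and_headings` are made-up helper names, and real renderers may build ids slightly differently.

```python
import re


def slugify(title):
    """Turn a heading title into an anchor id, e.g. 'Chapter 1' -> 'chapter-1'."""
    slug = re.sub(r"[^\w\s-]", "", title.lower())  # drop punctuation
    return re.sub(r"[\s_]+", "-", slug).strip("-")


def toc_and_headings(titles):
    """Return (toc_lines, heading_lines) using explicit <a id="..."> anchors."""
    toc = [f"* [{title}](#{slugify(title)})" for title in titles]
    headings = [f'## {title} <a id="{slugify(title)}"></a>' for title in titles]
    return toc, headings
```

Because the link target and the id come from the same function, the two cannot drift apart when a chapter is renamed.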

## Footnotes
Type marker [^1]
Type the footnote key at the end of a long document.
[^1]: Cupcake Ipsum is fun text.

[^2]: [Cupcake Ipsum](http://www.cupcakeipsum.com/#)

## Images

Images have a similar syntax to links but include a preceding exclamation point.

![Minion](http://octodex.github.com/images/minion.png)

or

![Alt text](http://octodex.github.com/images/stormtroopocat.jpg "The Stormtroopocat")

Like links, Images also have a footnote style syntax

![Alt text][id]

With a reference later in the document defining the URL location:

[id]: http://octodex.github.com/images/dojocat.jpg "The Dojocat"

The above cheatsheet is noted from http://assemble.io/docs/Cheatsheet-Markdown.html
(the site about static blog generation?)
**Write and publish a book**
Detailed writeup is very good at http://www.aristeia.com/authorAdvice.html
****
**Resource for Android**
Good resource and directions at http://wiki.cyanogenmod.org/w/Doc:_Development_Resources
**Ditch the MS word**
http://inundata.org/2012/12/04/how-to-ditch-word/
Software
Pandoc: format converter
Mendeley: reference manager, export to bib
Markdown editor
Knitr to insert data tables

Citations
To cite a reference, add it in like so:
some statement [@Costello2009].
statement with multiple citations [@Costello2009; @Costello2010].
Compile with citations:

    pandoc document.md -o document.pdf --bibliography citations.bib

With formatting specific to a journal?
Download the citation styles from here and drop them into your folder. Then specify that style during document generation:

    pandoc document.md -o document.pdf --bibliography cite.bib --csl style.csl

You can do a lot more, like adding results, tables, figures, and equations using MathJax, but I’ll save the more advanced stuff for a future post.
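When the same document is rebuilt often, the pandoc invocation above can be wrapped in a small script. This is only a sketch: `pandoc_command` and `build_document` are made-up names, and the `citeproc` switch is there because newer pandoc releases process citations only when `--citeproc` is passed, so whether you need it depends on the installed version.

```python
import subprocess


def pandoc_command(source, output, bibliography, csl=None, citeproc=False):
    """Build the argument list for a pandoc run with citations."""
    command = ["pandoc", source, "-o", output, "--bibliography", bibliography]
    if csl:
        # Journal-specific citation style file
        command += ["--csl", csl]
    if citeproc:
        # Needed by newer pandoc releases to actually process the citations
        command.append("--citeproc")
    return command


def build_document(*args, **kwargs):
    """Run pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(*args, **kwargs), check=True)
```

`build_document("document.md", "document.pdf", "cite.bib", csl="style.csl")` then runs the journal-style command from the notes, assuming pandoc is on the PATH.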

---

Usage: make [options] [target] …
Options:

    -B, --always-make                    Unconditionally make all targets.
    -C DIRECTORY, --directory=DIRECTORY  Change to DIRECTORY before doing anything.
    -f FILE, --file=FILE, --makefile=FILE
                                         Read FILE as a makefile.
    -I DIRECTORY, --include-dir=DIRECTORY
                                         Search DIRECTORY for included makefiles.
    -k, --keep-going                     Keep going when some targets can't be made.
    -p, --print-data-base                Print make's internal database.

---
****
Very low-level parts such as the kernel, libc (aka bionic), and many Linux-ish components are in C.
Low-level and third-party parts are in C or C++: ART (Android Runtime, for end-user programs), net tools, sound, shell, graphics drivers, etc.
The interactive user-facing Android “framework”, such as UI elements and most apps, is in Java.

The build system (.mk files, Makefiles, and the /build directory) creates a flashable .zip from source and is primarily located in the /build directory.
The various components/programs which together make up Android are each built independently through an Android-specific Android.mk file.
An Android.mk generally exists for each sub-project (or “module”) in its source-code directory.
This file directs the build system on exactly how to build that module, and where to put it in Android’s directory structure.
The files, once built, go in the /out/target/product/CODENAME directory (CODENAME is the code name of the device).
From there, they are zipped up and the final flashable (by recovery) .zip and flashable (by fastboot) .img files are produced.

You can peek at what’s been built there in /out, as the directories that are turned into the .img and .zip files are still around.
In addition to the /build directory, the Android source code is organized into a hierarchy of folders.
Take a look here at a brief description of what’s where in the source code.
The $OUT directory. Helpful tip: after you build, you can type cd $OUT to automatically go to the /out/target/product/CODENAME directory.

kernel: this is the kernel, obviously.
/system: all the stuff that will become the /system folder on Android.
/root: files that are turned into the ramdisk loaded and run by the kernel. The first program run by the kernel is called init, and it uses the init.rc and init.CODENAME.rc files to determine what happens next. See a discussion of that here.

/recovery/root: the ramdisk that contains the recovery mode is here.

Shortcut commands every CM dev should know (run on your computer, not the device).
$ . build/envsetup.sh: note the “.” at the beginning. This loads environment variables and shortcut aliases into your shell. To know more about “$ . build/envsetup.sh” and the breakfast, brunch and lunch commands, see the Envsetup_help page.

croot: this command will take you to the root of the source code.
mm and mm -B: this is the “make module” command.
It is very useful if you are working on a particular module and don’t need to rebuild everything. cd into the directory that you want to test, then just type mm to build only the module in the working directory. To build from scratch, add -B.
This is a good companion to adb sync system, which you can use to push the newly built files directly to your device for testing without having to reflash everything.

Make
make modules: this command will show all available targets. You can build a single one with make my_target.
make showcommands: this command will enable the verbose build mode.
adb remount: if pushing files to /system fails because it is in read-only mode, adb remount will remount /system in read-write mode, provided you have root permissions. This is in lieu of running mount -o rw,remount /system as root.

    diamonds$size[diamonds$carat >= 1] <- "Large"

## graphs

    barplot(table(diamonds$size), main="Diamond Size Distribution", xlab="Size Category", ylab="Number of Diamonds", col="blue")

Line charts

    ggplot(diamonds, aes(clarity)) + geom_freqpoly(aes(group = color, colour = color)) +
      labs(x="Clarity", y="Number of Diamonds", title="Clarity by Color")

Scatter plot:

    ggplot(diamonds, aes(carat, price, color=clarity)) + geom_point() +
      labs(x="Carat Weight", y="Price", title="Price by Carat Weight")

    data <- matrix(scan("birth.txt"), nrow=2, byrow=TRUE)

## RODBC

    library(RODBC)
    connection <- odbcConnect("<DSN>")

Once you have set up your connection, you could also use the sqlQuery() function to get data from .xls spreadsheets:

    query <- "<SQL Query>"
    data <- sqlQuery(connection, query)
    str(data)

At the end of an R session, don’t forget to close the connections:

    odbcCloseAll()

#DIF

    library(jsonlite)
    data <- fromJSON("<Path to your JSON file>")

    library(RJSONIO)
    data <- fromJSON("<Path to your JSON file>")

## For large data. fread() creates a data.table, which is different from read.table(), which creates a data frame of your data.

    library(data.table)
    data <- fread("http://assets.datacamp.com/blog_assets/chol.txt", sep="auto", nrows=-1, na.strings=c("NA","N/A",""), stringsAsFactors=FALSE)

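#sqldf

    library(sqldf)
    # Assumed call: the notes keep only the arguments, which match sqldf's read.csv.sql()
    bigdf <- read.csv.sql(file="<Path to your file>",
                          sql="select * from file where …",
                          colClasses=c("character", rep("numeric",10)))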
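The idea behind the sqldf fragment above, filtering a file with SQL instead of loading all of it into a data frame, can be sketched in Python with the standard csv and sqlite3 modules. `query_csv` is a made-up helper, the table name `file` just mirrors the query in the notes, and every column is stored as text, so numeric comparisons need a CAST in the query.

```python
import csv
import sqlite3


def query_csv(csv_path, sql, table="file"):
    """Load a CSV with a header row into in-memory SQLite and run sql against it."""
    connection = sqlite3.connect(":memory:")
    try:
        with open(csv_path, newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader)
            columns = ", ".join(f'"{name}"' for name in header)
            placeholders = ", ".join("?" for _ in header)
            connection.execute(f'CREATE TABLE "{table}" ({columns})')
            # executemany consumes the reader row by row
            connection.executemany(
                f'INSERT INTO "{table}" VALUES ({placeholders})', reader
            )
        return connection.execute(sql).fetchall()
    finally:
        connection.close()
```

With a hypothetical file that has `name` and `chol` columns, `query_csv("chol.csv", "select name from file where CAST(chol AS REAL) > 200")` returns only the matching rows as tuples.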

Certified Big Data Analyst Hadoop Certification Courses
This big data analytics and Hadoop training program extensively covers big data and predictive analytics techniques using R and Hadoop. Candidates get practical hands-on training on cutting-edge tools and big data platforms, like R and Hadoop (MapReduce, HBase, Hive, Pig, Oozie, Sqoop and Flume).
This big data online training is crafted by experts using real-life business datasets. As part of this program, candidates get access to the virtual lab and several case studies on big datasets for extensive hands-on practice. At the end of the program, candidates need to operationalize and complete a live project to consolidate their learning.
Who should attend this big data analytics and Hadoop training program? MBA students, IT professionals, and recent graduates who want a job in a big data analytics or data scientist role.
Certified Big Data Analytics Course Content (72 hours + practice sessions)
Business Analytics using R & Tableau
Introduction to R- environment
1. The Workspace
2. Input/ Output
3. Useful Packages (Base & other packages) in R
4. Graphic User Interfaces (R studio)
5. Customizing Startup
6. Batch Processing
7. Reusing Results
Data Input & Output (Importing & Exporting)
1. Data Structure & Data Types (Vectors, Matrices, factors, Data frames,  and Lists)
2. Importing Data (Importing data from csv, txt, Excel and other files)
3. Keyboard Input (Creating input by entering data)
4. Database Input (Connecting to database and use the data)
5. Exporting Data (Exporting files into different formats)
6. Viewing Data (Viewing partial data and full data)
7. Variable & Value Labels –  Date Values
8. Missing Data
Data Management
1. Creating New Variables (calculations & Binning)
2. Operators (Using multiple operators)
3. Built-in Functions & User Defined Function
4. Control Structures(conditional statements, Loops)
5. Sorting Data
6. Merging and Appending Data
7. Aggregating Data
8. Reshaping Data
9. Sub setting Data
10. Data Type Conversions
Visualization
1. Creating Graphs
2. Histograms & Density Plot
3. Dot Plots –  Bar Plots – Line Charts – Pie Charts – Boxplots – Scatterplots
Basic Statistics (Exploratory Analysis)
1. Descriptive Statistics(central tendency/variance)
2. Frequency Tables /Summarization
3. Hypothesis Testing
4. t-tests/z-test (1-sample, independent sample, paired sample)
5. Analysis of Variance(ANOVA)
6. Correlations/chi-square test
1. Introduction to predictive modeling & applications
2. Linear(Simple & Multiple) Regression
3. Logistic Regression
4. Introduction to segmentation
5. Segmentation using cluster analysis
Data Visualization using Tableau
1. Introduction to Tableau & Environment
2. Building basic views & sharing your work- overview
3. Data importing & manipulation
4. Maps/Tables/Calculated fields
5. Parameters
6. Data visualization with Charts maps
7. Building & customizing Reports
8. Building & customizing Dashboards
Machine Learning using R
1. What is Machine Learning?
2. Applications of Machine Learning Algorithms
3. Classification & Regression Problems
4. Training & Testing concepts – Cost & optimization functions
5. Artificial Neural Networks(ANN)
6. Support Vector Machines(SVM)
7. Decision Trees & Random Forest
8. Bayesian Network case
Social Media Analytics using R
1. Social Media – Characteristics of Social Media
2. Applications of Social Media Analytics
3. Metrics(Measures Actions) in social media analytics
4. Examples & Actionable Insights using Social Media Analytics
5. Text Analytics – Sentiment Analysis using R
6. Text Analytics – Word cloud analysis using R
Projects (Applying Overall Learning)
1. Solve Business problems using R/Tableau
1. What is Big Data?
2. Types of Data
3. Characteristics of Big Data
4. Need for understanding Big Data (Application of Big Data)
5. Traditional Approaches and its limitations
6. Introduction to Hadoop and eco-system
7. Getting Started with Hadoop (software installation etc.)
2. Hadoop Cluster in commodity hardware
4. HDFS layer
5. HDFS operation principle
MapReduce
1. Introduction to MapReduce
4. Setting up your MapReduce Environment
5. Building a MapReduce Program
6. Input Formats in MapReduce
7. OutputFormats in MapReduce
8. Basic MapReduce Programming using R
1. Introduction to RHdfs, Rmr and Rhbase
2. Develop Map reduce code using R for Local & Hadoop env
4. Predictive analytics using R- Hadoop
5. Overview of Parallelization using R without Hadoop
Introduction to Flume & Sqoop
1. Introduction to Sqoop (Why, what, processing, under the hood)
2. Exporting data from Hadoop using Sqoop
3. Introduction to Flume
4. Flume Use Cases
5. Hands on Exercise using Flume and Sqoop
PIG
1. Introduction to PIG
2. Components of PIG
3. PIG Data Model
4. Creating Mapreduce programs using PIG
5. Hands on Exercise using PIG
HIVE
1. Introduction to HIVE and its characteristics
2. Components of HIVE
3. HIVE Data Models
4. Serialization/De-serialization
5. HIVE file formats
6. HIVE Query Language
7. HIVE Functions
8. Difference between HIVE and PIG
9. Hands on Exercise using HIVE
H-Base
1. HBase introduction and its Characteristics
2. HBase Architecture
3. Storage Model of HBase
4. When to use HBase
5. HBase Data Model
6. HBase Families
7. HBase Components
8. Data Storage
9. Hands on Exercise using Hbase
Mahout
1. Mahout introduction and its Characteristics
2. Mahout Architecture
3. When to use Mahout
4. Which Machine Learning topics are covered in Mahout
5.  Hands on Exercise using Mahout
ZooKeeper
1. Introduction to ZooKeeper & its Features
2. Features of ZooKeeper
3. Challenges faced in distributed applications
4. Coordination
5. ZooKeeper: Goals and Uses
6. ZooKeeper: Entities, Data Model, Services
Misc Components
1. Overview of Apache Oozie
2. Overview of Storm
3. Overview of Apache Cassandra
4. Overview of Apache Spark
5. Overview of H2O
6. Social Media Analytics(Text Analysis, Word cloud)
Projects (Applying Overall Learning)