BDA Slips
Best Of Luck
BDA Slips FYMCA (Engineering)
!pip install pandas numpy matplotlib scikit-learn mlxtend
start-dfs.sh
start-yarn.sh
jps
hdfs dfs -mkdir /user/data
hdfs dfs -put input.txt /user/data
hdfs dfs -cat /user/data/input.txt
hdfs dfs -rm /user/data/input.txt
hadoop jar /usr/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples.jar wordcount /user/data/input /user/output
hdfs dfs -cat /user/output/part-r-00000
stop-dfs.sh
stop-yarn.sh
hdfs dfs -rm -r /user/output
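The wordcount job above runs Hadoop's bundled example. As a rough illustration of what that job computes, here is a minimal pure-Python sketch of the map, shuffle, and reduce steps. It is not Hadoop code: the function names are made up for this sketch, and the sample lines are hypothetical stand-ins for input.txt.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated token."""
    return [(word, 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group the emitted counts by word, as happens between map and reduce."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(grouped):
    """Sum the grouped counts for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Run map, shuffle, and reduce over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce_phase(shuffle_phase(pairs))


# Hypothetical contents standing in for input.txt
sample_lines = ["big data analytics", "big data hadoop"]
counts = word_count(sample_lines)

# Print one word and its count per line, sorted by word
for word in sorted(counts):
    print(f"{word}\t{counts[word]}")
```

The printed lines have the same general shape (word, tab, count) as what `hdfs dfs -cat /user/output/part-r-00000` shows after the real job finishes.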
File → New File → R Script
install.packages('tm')
install.packages('wordcloud')
install.packages('RColorBrewer')
install.packages('ggplot2')
library(packageName)
wordcloud(words=df$word, freq=df$freq, colors=brewer.pal(8, 'Dark2'))
Create a folder named BDA_Practicals and organize as:
BDA_Practicals/
├── Classification/
├── Clustering/
├── Association_Rules/
├── Hadoop/
├── R_Programs/
└── WordCloud/
Shift + Enter
Ctrl + Enter
start-dfs.sh && start-yarn.sh
stop-dfs.sh && stop-yarn.sh

pandas, numpy, sklearn
from sklearn.datasets import load_iris
train_test_split and apply KNeighborsClassifier
accuracy_score()
matplotlib scatter plot

mlxtend library using !pip install mlxtend
from mlxtend.frequent_patterns import apriori, association_rules
apriori(df, min_support=0.05, use_colnames=True)
association_rules(frequent_itemsets, metric='lift', min_threshold=1)
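To make the min_support and lift thresholds above less abstract, here is a small pure-Python sketch of how support, confidence, and lift are defined for a single rule. It does not use mlxtend and is not how mlxtend computes them internally. The transactions and the helper names `support` and `rule_metrics` are assumptions made up for illustration.

```python
# Hypothetical transactions; not taken from any dataset in the practicals
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence, and lift for the rule antecedent -> consequent.

    Assumes both sides occur at least once, so no division by zero.
    """
    s_both = support(set(antecedent) | set(consequent), transactions)
    s_a = support(antecedent, transactions)
    s_c = support(consequent, transactions)
    confidence = s_both / s_a
    lift = confidence / s_c
    return {"support": s_both, "confidence": confidence, "lift": lift}


bread_to_milk = rule_metrics({"bread"}, {"milk"}, transactions)
butter_to_bread = rule_metrics({"butter"}, {"bread"}, transactions)
print("bread -> milk:", bread_to_milk)
print("butter -> bread:", butter_to_bread)
```

On this toy data the rule bread -> milk has lift below 1 and butter -> bread has lift above 1, so a filter like metric='lift' with min_threshold=1 would keep only the second rule.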
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
classification_report()
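The GaussianNB and KNeighborsClassifier steps above can be sketched as follows. This is one possible arrangement, not the slip's official solution: the iris dataset, the 70/30 split, random_state=42, and n_neighbors=5 are all assumptions chosen for the example.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Assumed dataset: iris (the slip does not say which data to use here)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Assumed hyperparameters; adjust as the practical requires
models = {
    "GaussianNB": GaussianNB(),
    "KNeighborsClassifier": KNeighborsClassifier(n_neighbors=5),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    scores[name] = accuracy_score(y_test, y_pred)
    print(name, "accuracy:", round(scores[name], 3))
    # Per-class precision, recall, and F1 on the held-out split
    print(classification_report(y_test, y_pred))
```

Fitting both models on the same split keeps the two classification reports directly comparable.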
from sklearn.cluster import KMeans
StandardScaler
KMeans(n_clusters=3)

from sklearn.svm import SVC
linear and rbf
confusion_matrix() and accuracy_score()

hdfs dfs -put input.txt /user/input
hadoop jar /usr/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples.jar wordcount /user/input /user/output
hdfs dfs -cat /user/output/part-r-00000

start-dfs.sh, start-yarn.sh
hdfs dfs -mkdir /user/data
hdfs dfs -put sample.txt /user/data
hdfs dfs -cat /user/data/sample.txt
hdfs dfs -rm /user/data/sample.txt

iris or mtcars
hist()
boxplot()
plot()

tm, wordcloud, RColorBrewer
TermDocumentMatrix()
wordcloud(words, freq, colors=brewer.pal(8,'Dark2'))

c(), seq(), mean(), median(), sd() for basic stats
barplot(), plot(), hist()

pandas, numpy, matplotlib, scikit-learn, mlxtend, tm, wordcloud, RColorBrewer
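For the clustering and SVM steps listed earlier (StandardScaler, KMeans(n_clusters=3), SVC with linear and rbf kernels, confusion_matrix() and accuracy_score()), a minimal sketch might look like the following. The iris dataset, the 70/30 split, n_init=10, and random_state=42 are assumptions for illustration; the slips do not specify them.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumed dataset: iris
X, y = load_iris(return_X_y=True)

# Clustering: standardize the features, then fit KMeans with 3 clusters
X_scaled = StandardScaler().fit_transform(X)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
cluster_labels = kmeans.fit_predict(X_scaled)
print("cluster sizes:", [int((cluster_labels == k).sum()) for k in range(3)])

# Classification: compare linear and rbf kernels on a held-out split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
scaler = StandardScaler().fit(X_train)  # fit the scaler on training data only

svm_scores = {}
for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel)
    clf.fit(scaler.transform(X_train), y_train)
    y_pred = clf.predict(scaler.transform(X_test))
    svm_scores[kernel] = accuracy_score(y_test, y_pred)
    print(kernel, "accuracy:", round(svm_scores[kernel], 3))
    print(confusion_matrix(y_test, y_pred))
```

The scaler for the SVM is fitted on the training split only, so the test split does not influence the scaling.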