BDA Slips

Best Of Luck 👍

BDA Slips FYMCA (Engineering)

Install Anaconda (download the installer from the official Anaconda website).
Open Anaconda Navigator → Launch Jupyter Notebook.
Create a new file: New → Python 3 (ipykernel).
Write or paste your BDA Python code (e.g., KNN, Naïve Bayes, K-Means, Apriori, etc.).
Press Shift + Enter to run the code cell.
To install missing libraries, use inside a cell:
!pip install pandas numpy matplotlib scikit-learn mlxtend
Save the notebook (code and output together): File → Save and Checkpoint (File → Save Notebook in newer Jupyter versions).
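
Before pasting a program, it can help to check which of the libraries from the pip command above are already importable. The snippet below is a minimal sketch using only the standard library; the IMPORT_NAMES mapping and the missing_packages helper are names made up for this illustration. Note that scikit-learn is imported as sklearn.

```python
import importlib.util

# pip package name -> module name used in "import ..." (illustrative mapping)
IMPORT_NAMES = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
    "mlxtend": "mlxtend",
}


def missing_packages(import_names):
    """Return the pip names whose module cannot be found in this environment."""
    return [
        pip_name
        for pip_name, module_name in import_names.items()
        if importlib.util.find_spec(module_name) is None
    ]


missing = missing_packages(IMPORT_NAMES)
if missing:
    print("Missing, install with pip:", " ".join(missing))
else:
    print("All listed libraries are importable.")
```

Run it in its own cell. Anything it reports as missing can be installed with the pip command shown in the steps above.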

💡 Tip: Store all datasets (CSV/Excel) in the same folder as your notebook for easy file access.
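
Keeping the dataset next to the notebook means a bare file name is enough to open it, because relative paths resolve against the folder the notebook runs in. The sketch below shows the idea with the standard csv module. The load_rows helper, the students.csv file and its contents are hypothetical, and the demo writes to a temporary folder only so that it runs anywhere.

```python
import csv
import tempfile
from pathlib import Path


def load_rows(filename, folder=None):
    """Read a CSV file into a list of dicts; folder defaults to the current one."""
    base = Path(folder) if folder is not None else Path.cwd()
    path = base / filename
    if not path.exists():
        raise FileNotFoundError(f"{filename} not found in {base}")
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Demo with a made-up dataset in a temporary folder (stands in for the notebook folder).
with tempfile.TemporaryDirectory() as demo_folder:
    sample = Path(demo_folder) / "students.csv"
    sample.write_text("name,marks\nAsha,82\nRavi,74\n", encoding="utf-8")
    rows = load_rows("students.csv", folder=demo_folder)

print(len(rows), "rows loaded; first row:", rows[0])
```

In a real notebook the call would simply be load_rows("your_file.csv"). pandas.read_csv accepts the same kind of relative file name.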

Open your Hadoop terminal (Ubuntu or Cloudera VM).
Start Hadoop services:
start-dfs.sh
start-yarn.sh
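Check status: jps (on a single-node setup the list should include NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager)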
👉 Add file to HDFS (-p also creates /user if it does not exist yet):
hdfs dfs -mkdir -p /user/data
hdfs dfs -put input.txt /user/data
👉 View file:
hdfs dfs -cat /user/data/input.txt
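👉 Run WordCount MapReduce Example (the input path must match the file uploaded above; adjust the jar path to your installation, where the jar name often carries a version suffix):
hadoop jar /usr/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples.jar wordcount /user/data/input.txt /user/output
View output:
hdfs dfs -cat /user/output/part-r-00000
👉 Delete file (only once the job no longer needs it):
hdfs dfs -rm /user/data/input.txt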
Stop services (YARN first, then HDFS):
stop-yarn.sh
stop-dfs.sh

⚠️ Note: Hadoop refuses to start a job whose output folder already exists, so always delete the old output folder before rerunning: hdfs dfs -rm -r /user/output
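
To see what the WordCount job computes, the map, shuffle and reduce steps can be imitated locally in a few lines of Python. This is only a sketch of the idea, not the code inside the Hadoop examples jar. The function names and sample lines are made up, and it assumes words are split on whitespace and counted case-sensitively.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every whitespace-separated token."""
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(word, counts):
    """Reducer: add up the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    # Keys are processed in sorted order, as in the reducer output file.
    return [reduce_phase(word, grouped[word]) for word in sorted(grouped)]


sample_lines = ["big data big ideas", "data beats opinion"]  # made-up input
result = word_count(sample_lines)
for word, total in result:
    print(f"{word}\t{total}")  # word, a tab, then the count
```

Each printed line has the same shape as a line of part-r-00000: a word, a tab and its count.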

Install R and RStudio (download R from CRAN and RStudio Desktop from the Posit website).
Open RStudio and create a new R script (File → New File → R Script).
Write or paste your R code (e.g., statistical visualization, boxplots, WordCloud, etc.).
Run line by line using Ctrl + Enter.
Install required packages if missing:
install.packages('tm')
install.packages('wordcloud')
install.packages('RColorBrewer')
install.packages('ggplot2')
Load libraries using library(packageName).
Execute your script and view plots in the “Plots” panel.

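✅ Example WordCloud Command (assumes a data frame df with columns word and freq, and that wordcloud and RColorBrewer are loaded):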
wordcloud(words=df$word, freq=df$freq, colors=brewer.pal(8, 'Dark2'))

📁 Project Folder Setup:
Create a folder named BDA_Practicals and organize as:

BDA_Practicals/
    ├── Classification/
    ├── Clustering/
    ├── Association_Rules/
    ├── Hadoop/
    ├── R_Programs/
    └── WordCloud/
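
The folder tree above can also be created from a notebook cell instead of by hand. The following is a minimal sketch with pathlib. create_layout is a hypothetical helper, and the demo builds the tree inside a temporary folder so that running it does not touch real files.

```python
import tempfile
from pathlib import Path

# Subfolder names taken from the layout shown above.
SUBFOLDERS = [
    "Classification",
    "Clustering",
    "Association_Rules",
    "Hadoop",
    "R_Programs",
    "WordCloud",
]


def create_layout(parent="."):
    """Create BDA_Practicals and its subfolders under parent; return the root path."""
    root = Path(parent) / "BDA_Practicals"
    for name in SUBFOLDERS:
        # exist_ok=True makes the call safe to repeat on an existing tree.
        (root / name).mkdir(parents=True, exist_ok=True)
    return root


# Demo in a temporary folder; call create_layout() with no argument to use the current folder.
with tempfile.TemporaryDirectory() as scratch:
    root = create_layout(scratch)
    created = sorted(p.name for p in root.iterdir())

print(created)
```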

💡 Tip: Always test on a small dataset first, check the output visually, and keep screenshots of the results for practical submission.