Image: ubuntu2204
Kernel: Python 3 (system-wide)
"""
Q1: B
Q2: A
Q3: C
Q4: A
Q5: B
Q6: C
Q7: B
Q8: D
Q9: C
Q10: C
"""
'\nQ1: B\nQ2: A\nQ3: C\nQ4: A\nQ5: B\nQ6: C\nQ7: B\nQ8: D\nQ9: C\nQ10: C\n'
from sklearn.datasets import load_iris
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Load Iris dataset
iris = load_iris()
data = pd.DataFrame(data=iris.data, columns=iris.feature_names)
data['target'] = iris.target

# Check for missing values
data.isnull().sum()

# Basic statistics
data.describe()

# Explore class distribution
sns.countplot(x='target', data=data)
plt.show()

# Pairplot to visualize relationships between features
sns.pairplot(data, hue='target')
plt.show()

X = data.drop('target', axis=1)
y = data['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize and train a classification model (e.g., Logistic Regression)
model = LogisticRegression()
model.fit(X_train, y_train)

# Predict using the test set
y_pred = model.predict(X_test)

# Evaluate model performance
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy}")

# Other classification metrics
print(classification_report(y_test, y_pred))

# Confusion matrix
conf_matrix = confusion_matrix(y_test, y_pred)
sns.heatmap(conf_matrix, annot=True, cmap="YlGnBu")
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.show()
[Figure: count plot of the class distribution (x-axis: target)]
[Figure: pair plot of the four features, colored by target]
Accuracy: 1.0
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30
[Figure: confusion matrix heatmap (x-axis: Predicted, y-axis: Actual)]
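As a cross-check on the report above, the accuracy and the per-class precision and recall can be recovered from a confusion matrix by hand. The sketch below is only illustrative: it assumes the matrix is diagonal with the support counts (10, 9, 11) on the diagonal, which is what a report with every precision and recall equal to 1.00 implies. It follows scikit-learn's layout, where rows are actual classes and columns are predicted classes.

```python
# Assumed confusion matrix, inferred from the report (not copied from the notebook output):
# rows = actual class, columns = predicted class.
conf_matrix = [
    [10, 0, 0],
    [0, 9, 0],
    [0, 0, 11],
]
n_classes = len(conf_matrix)

# Accuracy: correctly classified samples (the diagonal) over all samples.
total = sum(sum(row) for row in conf_matrix)
correct = sum(conf_matrix[i][i] for i in range(n_classes))
accuracy = correct / total

# Recall per class: diagonal entry over its row sum (all samples that truly belong to the class).
recall = [conf_matrix[i][i] / sum(conf_matrix[i]) for i in range(n_classes)]

# Precision per class: diagonal entry over its column sum (all samples predicted as the class).
precision = [
    conf_matrix[i][i] / sum(conf_matrix[r][i] for r in range(n_classes))
    for i in range(n_classes)
]

print(f"Accuracy: {accuracy}")
print("Recall:", recall)
print("Precision:", precision)
```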
"""
i. The function unknown_function computes several summary statistics for a list of floating-point numbers:

Mean: the average of the input list, computed with statistics.mean() and stored in the output dictionary under the key "mean".
Median: the middle value of the input list, computed with statistics.median() and stored under the key "median".
Standard Deviation: the spread of the values, computed with statistics.stdev() when the list has more than one element. Otherwise the value is set to 0.0, because statistics.stdev() raises an error for fewer than two data points. The result is stored under the key "stdev".
Maximum Value: the largest value in the input list, found with max() and stored under the key "max_value".
Minimum Value: the smallest value in the input list, found with min() and stored under the key "min_value".

The function returns a dictionary (stat_dict) containing these measures. If the input list is empty, it returns an empty dictionary, since there are no values to compute statistics from.

ii. The function uses random.sample() to draw a random sample of sample_size elements from the input my_list. An assertion inside a try/except block checks that sample_size does not exceed the length of my_list. If the assertion fails, an AssertionError is raised with a corresponding error message; the function catches the exception, prints the traceback using traceback.print_exc(), and returns None. If the assertion passes, the function returns the random sample of sample_size elements.
"""
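The two descriptions above can be sketched as follows. This is a reconstruction from the written answer, not the original code from the assignment: the name sample_from_list and the assertion message are hypothetical, and unknown_function, stat_dict, my_list, and sample_size are taken from the text.

```python
import random
import statistics
import traceback


def unknown_function(values):
    """Return summary statistics for a list of floats (empty dict for an empty list)."""
    stat_dict = {}
    if not values:
        return stat_dict
    stat_dict["mean"] = statistics.mean(values)
    stat_dict["median"] = statistics.median(values)
    # statistics.stdev() needs at least two data points, so fall back to 0.0.
    stat_dict["stdev"] = statistics.stdev(values) if len(values) > 1 else 0.0
    stat_dict["max_value"] = max(values)
    stat_dict["min_value"] = min(values)
    return stat_dict


def sample_from_list(my_list, sample_size):  # hypothetical name; the text does not give one
    """Return a random sample of sample_size elements, or None if the size check fails."""
    try:
        # Hypothetical message; the text only says an error message is attached.
        assert sample_size <= len(my_list), "sample_size exceeds the length of my_list"
        return random.sample(my_list, sample_size)
    except AssertionError:
        traceback.print_exc()
        return None


print(unknown_function([1.0, 2.0, 3.0, 4.0]))
print(sample_from_list([1, 2, 3, 4], 2))
print(sample_from_list([1, 2, 3], 5))  # prints a traceback, then None
```

One caveat with this design: assert statements are skipped when Python runs with the -O flag, in which case an oversized sample_size would reach random.sample() and raise a ValueError instead.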