100% Passing Guarantee - Brilliant Databricks-Certified-Professional-Data-Scientist Exam Questions PDF [Oct-2021]
Databricks-Certified-Professional-Data-Scientist Dumps 2021 - New Databricks Databricks-Certified-Professional-Data-Scientist Exam Questions
NEW QUESTION 20
You have modeled a dataset with five mutually independent variables, A, B, C, D and E. Variables A, B and C are continuous, and variables D and E are discrete (mixed mode).
Now you have to compute the expected value of one variable, say A. Which of the following computations would you prefer?
- A. Transformation
- B. Differentiation
- C. Integration
- D. Generalization
Answer: C
Explanation:
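Because A is continuous, its expected value is an integral over its density, whereas a discrete variable such as D would use a sum. A sketch in standard notation, where $f_A$ and $p_D$ are assumed symbols for the density of A and the probability mass function of D (neither appears in the original question):

```latex
\mathbb{E}[A] = \int_{-\infty}^{\infty} a \, f_A(a) \, da
\qquad\text{versus}\qquad
\mathbb{E}[D] = \sum_{d} d \, p_D(d)
```

Since the variables are independent, the density of A does not need to be derived from the other four variables before integrating.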
NEW QUESTION 21
Select the statements that correctly apply to Naive Bayes
- A. Works with nominal values
- B. Works with a small amount of data
- C. Sensitive to how the input data is prepared
Answer: A,B,C
NEW QUESTION 22
In which of the following scenarios can you use regression to predict values?
- A. All 1, 2 and 3
- B. Samsung can use it for mobile sales forecast
- C. Probability of the celebrity divorce
- D. Mobile companies can use it to forecast manufacturing defects
- E. Only 1 and 2
Answer: A
Explanation:
Regression is a tool that companies may use for things such as sales forecasts or forecasting manufacturing defects. Another creative example is predicting the probability of celebrity divorce.
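As an illustration of the sales forecast case, here is a minimal Python sketch of simple linear regression fitted by ordinary least squares. The monthly figures and the names `fit_line`, `months` and `units_sold` are hypothetical and chosen only for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: units sold in each of five consecutive months.
months = [1, 2, 3, 4, 5]
units_sold = [110.0, 120.0, 130.0, 140.0, 150.0]

slope, intercept = fit_line(months, units_sold)

# Forecast for the next month by extending the fitted line.
forecast_month_6 = slope * 6 + intercept
print(forecast_month_6)
```

Real sales data would not fall exactly on a line, so a fit like this is usually judged on held-out months.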
NEW QUESTION 23
You are creating a model to recommend books at Amazon.com. Which of the following recommender systems would you use so that you do not have a cold start problem?
- A. Naive Bayes classifier
- B. Content-based filtering
- C. User-based collaborative filtering
- D. Item-based collaborative filtering
Answer: B
Explanation:
The cold start problem is most prevalent in recommender systems. Recommender systems form a specific type of information filtering (IF) technique that attempts to present information items (movies, music, books, news, images, web pages) that are likely of interest to the user. Typically, a recommender system compares the user's profile to some reference characteristics. These characteristics may be from the information item (the content-based approach) or the user's social environment (the collaborative filtering approach). In the content-based approach, the system must be capable of matching the characteristics of an item against relevant features in the user's profile. In order to do this, it must first construct a sufficiently-detailed model of the user's tastes and preferences through preference elicitation. This may be done either explicitly (by querying the user) or implicitly (by observing the user's behaviour). In both cases, the cold start problem would imply that the user has to dedicate an amount of effort using the system in its 'dumb' state - contributing to the construction of their user profile - before the system can start providing any intelligent recommendations.
Content-based filtering recommender systems use information about items or users to make recommendations, rather than user preferences, so they perform well with little user preference data. Item-based and user-based collaborative filtering make predictions based on users' preferences for items, so they typically perform poorly with little user preference data. Logistic regression is not a recommender system technique.
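A minimal Python sketch of content-based filtering, assuming each book is described by a small feature vector. The titles, the three genre features and the function names are hypothetical. A single liked item is enough to rank the rest, which is why this approach copes with sparse preference data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical item features: [fantasy, science, history].
books = {
    "Dragon Saga": [1.0, 0.0, 0.2],
    "Wizard Tales": [0.9, 0.1, 0.0],
    "Physics Primer": [0.0, 1.0, 0.1],
    "Roman Empire": [0.1, 0.0, 1.0],
}


def recommend(liked_title, top_n):
    """Rank the other books by similarity of their features to the liked book."""
    liked = books[liked_title]
    others = [title for title in books if title != liked_title]
    ranked = sorted(others, key=lambda t: cosine(liked, books[t]), reverse=True)
    return ranked[:top_n]


print(recommend("Dragon Saga", 2))
```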
NEW QUESTION 24
Regularization is a very important technique in machine learning to prevent overfitting. Mathematically speaking, it adds a regularization term in order to prevent the coefficients from fitting the training data so perfectly that the model overfits. The difference between L1 and L2 is...
- A. None of the above
- B. L2 is the sum of the square of the weights, while L1 is just the sum of the weights
- C. L1 gives Non-sparse output while L2 gives sparse outputs
- D. L1 is the sum of the square of the weights, while L2 is just the sum of the weights
Answer: B
Explanation:
Regularization is a very important technique in machine learning to prevent overfitting. Mathematically speaking, it adds a regularization term to the loss in order to prevent the coefficients from fitting the training data so perfectly that the model overfits. The difference between L1 and L2 is that L2 is the sum of the squares of the weights, while L1 is the sum of the absolute values of the weights (loosely, "the sum of the weights"). For example, L1 regularization on least squares, in standard notation:
$$\min_{w} \sum_{j} \left( y_j - \sum_{i} w_i x_{ji} \right)^2 + \lambda \sum_{i} |w_i|$$
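To make the two penalty terms concrete, here is a minimal Python sketch that computes them for one weight vector. The weights and the function names are hypothetical. Note that the L1 penalty uses absolute values, so negative weights are penalized too.

```python
def l1_penalty(weights):
    """L1 penalty: sum of the absolute values of the weights."""
    return sum(abs(w) for w in weights)


def l2_penalty(weights):
    """L2 penalty: sum of the squares of the weights."""
    return sum(w * w for w in weights)


# Hypothetical coefficient vector.
weights = [0.5, -2.0, 0.0, 3.0]

print(l1_penalty(weights))  # 0.5 + 2.0 + 0.0 + 3.0
print(l2_penalty(weights))  # 0.25 + 4.0 + 0.0 + 9.0
```

In a regularized model, either penalty is multiplied by a strength parameter and added to the training loss.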
NEW QUESTION 25
Which of the following questions fall under the data science category?
- A. What happened in the last six months?
- B. Where is the problem for sales?
- C. Which is the optimal scenario for selling this product?
- D. What happens if this scenario continues?
- E. How many products were sold in the last month?
Answer: C,D
Explanation:
This question checks your understanding of BI and data science. BI already existed and the analytics team was already using it; the team needs to learn data science techniques to solve some problems. The options may look confusing, but if you have worked in BI or as a data scientist, the question is easy to answer. Questions such as what happened in the last six months, where the sales problem is, and how many products were sold last month can be answered with a reporting solution.
The remaining two options require data science techniques. To determine which scenario is optimal for product sales, or what happens if a scenario continues, you need to collect data and apply techniques such as optimization, predictive modeling, and statistical analysis on structured and unstructured data.
NEW QUESTION 26
Suppose you have been given two random variables X and Y whose joint distribution is already known. The marginal distribution of X is simply the probability distribution of X averaging over information about Y.
It is the probability distribution of X when the value of Y is not known. How do you calculate the marginal distribution of X?
- A. This is typically calculated by integrating (in the case of a continuous variable) the joint probability distribution over Y.
- B. This is typically calculated by integrating the joint probability distribution over Y.
- C. This is typically calculated by summing the joint probability distribution over Y.
- D. This is typically calculated by summing (in the case of a discrete variable) the joint probability distribution over Y.
Answer: A,B,C,D
Explanation:
Given two random variables X and Y whose joint distribution is known, the marginal distribution of X is simply the probability distribution of X averaging over information about Y.
It is the probability distribution of X when the value of Y is not known. This is typically calculated by summing or integrating the joint probability distribution over Y. For discrete random variables, the marginal probability mass function can be written as Pr(X = x). This is
$$\Pr(X = x) = \sum_{y} \Pr(X = x, Y = y) = \sum_{y} \Pr(X = x \mid Y = y)\,\Pr(Y = y)$$
where Pr(X = x, Y = y) is the joint distribution of X and Y, while Pr(X = x | Y = y) is the conditional distribution of X given Y. In this case, the variable Y has been marginalized out.
Bivariate marginal and joint probabilities for discrete random variables are often displayed as two-way tables.
Similarly, for continuous random variables, the marginal probability density function can be written as $p_X(x)$. This is
$$p_X(x) = \int_{y} p_{X,Y}(x, y)\,dy = \int_{y} p_{X \mid Y}(x \mid y)\,p_Y(y)\,dy$$
where $p_{X,Y}(x, y)$ gives the joint distribution of X and Y, while $p_{X \mid Y}(x \mid y)$ gives the conditional distribution of X given Y. Again, the variable Y has been marginalized out.
Note that a marginal probability can always be written as an expected value:
$$p_X(x) = \int_{y} p_{X \mid Y}(x \mid y)\,p_Y(y)\,dy = \mathbb{E}_Y\left[p_{X \mid Y}(x \mid y)\right]$$
Intuitively, the marginal probability of X is computed by examining the conditional probability of X given a particular value of Y, and then averaging this conditional probability over the distribution of all values of Y. This follows from the definition of expected value, i.e. in general
$$\mathbb{E}_Y[f(Y)] = \int_{y} f(y)\,p_Y(y)\,dy$$
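For the discrete case, marginalizing amounts to summing a two-way joint table over Y. A minimal Python sketch follows; the weather-style outcomes, the probabilities and the name `marginal_x` are hypothetical.

```python
from collections import defaultdict

# Hypothetical joint distribution Pr(X = x, Y = y); the values sum to 1.
joint = {
    ("sun", "hot"): 0.4,
    ("sun", "cold"): 0.2,
    ("rain", "hot"): 0.1,
    ("rain", "cold"): 0.3,
}


def marginal_x(joint_table):
    """Sum the joint probabilities over every value of Y for each value of X."""
    totals = defaultdict(float)
    for (x, _y), p in joint_table.items():
        totals[x] += p
    return dict(totals)


p_x = marginal_x(joint)
print(p_x)
```

For continuous variables the sum becomes an integral, which in practice is often approximated numerically.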
NEW QUESTION 27
Which classifier assumes independence among all its features?
- A. Linear Regression
- B. Naive Bayes
- C. Neural networks
- D. Random forests
Answer: B
Explanation:
A Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem (from Bayesian statistics) with strong (naive) independence assumptions. A more descriptive term for the underlying probability model would be "independent feature model".
In simple terms, a naive Bayes classifier assumes that the presence (or absence) of a particular feature of a class is unrelated to the presence (or absence) of any other feature. For example, a fruit may be considered to be an apple if it is red, round, and about 4" in diameter. Even if these features depend on each other or upon the existence of the other features, a naive Bayes classifier considers all of these properties to independently contribute to the probability that this fruit is an apple.
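The independence assumption means the score for a class is simply its prior multiplied by one likelihood per feature. Here is a minimal Python sketch of the fruit example; every probability below is made up for illustration, and the second class ("cherry") is an assumption added so there is something to compare against.

```python
# Hypothetical class priors and per-feature likelihoods P(feature | class).
priors = {"apple": 0.5, "cherry": 0.5}
likelihoods = {
    "apple": {"red": 0.7, "round": 0.9, "about_4_inches": 0.8},
    "cherry": {"red": 0.9, "round": 0.9, "about_4_inches": 0.01},
}


def naive_bayes_posterior(observed_features):
    """Multiply the prior by each feature likelihood independently, then normalize."""
    scores = {}
    for label, prior in priors.items():
        score = prior
        for feature in observed_features:
            score *= likelihoods[label][feature]
        scores[label] = score
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


posterior = naive_bayes_posterior(["red", "round", "about_4_inches"])
print(posterior)
```

No term in the product looks at two features jointly, which is exactly the "naive" part.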
NEW QUESTION 28
You are working on a clustering solution for a customer dataset. Almost 40 variables are available for each customer, and data for almost 1.00,0000 customers is available. You want to reduce the number of variables for clustering. What would you do?
- A. You will randomly reduce the number of variables
- B. You will find the correlation among the variables and discard the variables that are not correlated.
- C. You will find the correlation among the variables and, from the highly correlated variables, consider only one or two.
- D. You can combine several variables into one variable
- E. You cannot discard any variable when creating clusters.
Answer: C,D
Explanation:
When you apply a clustering technique and find that a large number of variables are available, it is better to find the correlation among the variables and consider only one or two variables from each highly correlated group, because highly correlated variables have the same effect when creating the clusters. A scatter plot matrix of the variables can be used to find the correlation.
You can also combine several variables into a single variable. For example, if the dataset contains Asset and Debt, you can combine them into a Debt to Asset ratio and use that when creating the clusters.
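Both ideas can be sketched in a few lines of Python. The column names, the values and the 0.9 threshold are hypothetical choices for illustration, and the greedy "keep the first of each correlated group" rule is only one simple way to apply the idea.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def drop_correlated(columns, threshold=0.9):
    """Keep a variable only if it is not highly correlated with one already kept."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical customer variables; income and spend move almost together.
customers = {
    "income": [30, 45, 60, 80, 100],
    "spend": [15, 22, 31, 40, 50],
    "visits": [4, 1, 5, 2, 3],
}
kept_variables = drop_correlated(customers)

# Combining two variables into one: a debt-to-asset ratio.
debt = [10.0, 20.0]
asset = [100.0, 50.0]
debt_to_asset = [d / a for d, a in zip(debt, asset)]
print(kept_variables, debt_to_asset)
```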
NEW QUESTION 29
Which activity is performed in the Operationalize phase of the Data Analytics Lifecycle?
- A. Transform existing variables
- B. Try different variables
- C. Try different analytical techniques
- D. Define the process to maintain the model
Answer: D
Explanation:
Operationalize: In the final phase, the team communicates the benefits of the project more broadly and sets up a pilot project to deploy the work in a controlled way before broadening the work to a full enterprise or ecosystem of users. In Phase 4, the team scored the model in the analytics sandbox.
NEW QUESTION 30
In which phase of the analytic lifecycle would you expect to spend most of the project time?
- A. Discovery
- B. Operationalize
- C. Communicate Results
- D. Data preparation
Answer: D
Explanation:
In the data preparation phase of the Data Analytics Lifecycle, the data range and distribution can be obtained.
If the data is skewed, viewing the logarithm of the data (if it's all positive) can help detect structures that might otherwise be overlooked in a graph with a regular, nonlogarithmic scale.
When preparing the data, one should look for signs of dirty data, as explained in the previous section. Examining if the data is unimodal or multimodal will give an idea of how many distinct populations with different behavior patterns might be mixed into the overall population. Many modeling techniques assume that the data follows a normal distribution. Therefore, it is important to know if the available dataset can match that assumption before applying any of those modeling techniques.
NEW QUESTION 31
Select the algorithm that is an unsupervised learning algorithm
- A. K-Nearest Neighbors
- B. Naive Bayes
- C. K-Means
- D. Support Vector Machines
Answer: C
Explanation:
Supervised learning tasks

| Classification | Regression |
|---|---|
| k-Nearest Neighbors | Linear |
| Naive Bayes | Locally weighted linear |
| Support vector machines | Ridge |
| Decision trees | Lasso |

Unsupervised learning tasks

| Clustering | Density estimation |
|---|---|
| k-Means | Expectation maximization |
| DBSCAN | Parzen window |
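What makes k-Means unsupervised is that it groups points without ever seeing a label. A minimal one-dimensional Python sketch follows; the points, the starting centroids and the fixed iteration count are hypothetical choices to keep the example deterministic.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Plain k-means on 1-D data: assign each point to its nearest centroid, then re-average."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical unlabeled measurements forming two obvious groups.
points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = kmeans_1d(points, [0.0, 5.0])
print(centroids, clusters)
```

By contrast, k-Nearest Neighbors, Naive Bayes and support vector machines all need labeled training examples.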
NEW QUESTION 32
Select the correct sequence of steps for developing a machine learning application
A) Analyze the input data
B) Prepare the input data
C) Collect data
D) Train the algorithm
E) Test the algorithm
F) Use It
- A. C, B, A, D, E, F
- B. A, B, C, D, E, F
- C. C, A, B, D, E, F
- D. C, B, A, D, E, F
Answer: D
Explanation:
1 Collect data. You could collect the samples by scraping a website and extracting data, or you could get information from an RSS feed or an API. You could have a device collect wind speed measurements and send them to you, or blood glucose levels, or anything you can measure. The number of options is endless. To save some time and effort you could use publicly available data.
2 Prepare the input data. Once you have this data, you need to make sure it's in a useable format. The format we'll be using in this book is the Python list. We'll talk about Python more in a little bit, and lists are reviewed in appendix A.
The benefit of having this standard format is that you can mix and match algorithms and data sources. You may need to do some algorithm-specific formatting here. Some algorithms need features in a special format, some algorithms can deal with target variables and features as strings, and some need them to be integers. We'll get to this later, but the algorithm-specific formatting is usually trivial compared to collecting data.
3 Analyze the input data. This is looking at the data from the previous task. This could be as simple as looking at the data you've parsed in a text editor to make sure steps 1 and 2 are actually working and you don't have a bunch of empty values. You can also look at the data to see if you can recognize any patterns or if there's anything obvious, such as a few data points that are vastly different from the rest of the set. Plotting data in one, two, or three dimensions can also help. But most of the time you'll have more than three features, and you can't easily plot the data across all features at one time. You could, however, use some advanced methods we'll talk about later to distill multiple dimensions down to two or three so you can visualize the data.
If you're working with a production system and you know what the data should look like, or you trust its source, you can skip this step. This step takes human involvement, and for an automated system you don't want human involvement. The value of this step is that it makes you understand you don't have garbage coming in.
4 Train the algorithm. This is where the machine learning takes place. This step and the next step are where the "core" algorithms lie, depending on the algorithm. You feed the algorithm good clean data from the first two steps and extract knowledge or information. This knowledge you often store in a format that's readily useable by a machine for the next two steps. In the case of unsupervised learning, there's no training step because you don't have a target value. Everything is used in the next step.
5 Test the algorithm. This is where the information learned in the previous step is put to use. When you're evaluating an algorithm, you'll test it to see how well it does. In the case of supervised learning, you have some known values you can use to evaluate the algorithm. In unsupervised learning, you may have to use some other metrics to evaluate the success. In either case, if you're not satisfied, you can go back to step 4, change some things, and try testing again. Often the collection or preparation of the data may have been the problem, and you'll have to go back to step 1.
6 Use it. Here you make a real program to do some task, and once again you see if all the previous steps worked as you expected. You might encounter some new data and have to revisit steps 1-5.
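The sequence collect, prepare, analyze, train, test, use can be sketched end to end in Python. Everything here is hypothetical: the raw records, the function names, and the deliberately trivial "model" (a threshold halfway between the two class means) stand in for a real data source and algorithm.

```python
def collect():
    """Collect: hypothetical raw 'measurement,label' records, one of them empty."""
    return ["1.0,low", "1.2,low", "0.8,low", "", "3.1,high",
            "2.9,high", "3.3,high", "1.1,low", "3.0,high"]


def prepare(raw):
    """Prepare: parse records into (float, str) pairs and skip empty rows."""
    rows = []
    for line in raw:
        if not line:
            continue
        value, label = line.split(",")
        rows.append((float(value), label))
    return rows


def analyze(rows):
    """Analyze: a quick sanity summary to confirm the earlier steps worked."""
    values = [v for v, _ in rows]
    return {"count": len(rows), "min": min(values), "max": max(values)}


def train(rows):
    """Train: learn a threshold halfway between the two class means."""
    low = [v for v, label in rows if label == "low"]
    high = [v for v, label in rows if label == "high"]
    return (sum(low) / len(low) + sum(high) / len(high)) / 2


def predict(threshold, value):
    """Use: classify a new measurement with the learned threshold."""
    return "high" if value > threshold else "low"


def evaluate(threshold, rows):
    """Test: fraction of held-out rows classified correctly."""
    correct = sum(predict(threshold, v) == label for v, label in rows)
    return correct / len(rows)


rows = prepare(collect())
summary = analyze(rows)
train_rows, test_rows = rows[:6], rows[6:]
threshold = train(train_rows)
accuracy = evaluate(threshold, test_rows)
print(summary, threshold, accuracy, predict(threshold, 2.5))
```

If the accuracy were unsatisfactory, the loop back would be to `train`, or all the way back to `collect` and `prepare` if the data were the problem.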
NEW QUESTION 33
Suppose A, B, and C are events. The probability of A given B, relative to P(·|C), is the same as the probability of A given B and C (relative to P). That is,
- A. P(A,B|C) / P(B|C) = P(B|A,C)
- B. P(A,B|C) / P(B|C) = P(A|C,B)
- C. P(A,B|C) / P(B|C) = P(C|B,C)
- D. P(A,B|C) / P(B|C) = P(A|B,C)
Answer: D
Explanation:
From the definition,
$$\frac{P(A,B \mid C)}{P(B \mid C)} = \frac{P(A,B,C)/P(C)}{P(B,C)/P(C)} = \frac{P(A,B,C)}{P(B,C)} = P(A \mid B,C)$$
This follows from the definition of conditional probability, applied twice: $P(A,B) = P(A \mid B)\,P(B)$.
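The identity can be checked numerically by enumerating a small sample space. In this Python sketch the sample space (two fair dice) and the three events are hypothetical choices; exact fractions avoid rounding issues.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event, given=lambda o: True):
    """P(event | given), computed by counting equally likely outcomes."""
    conditioned = [o for o in outcomes if given(o)]
    hits = [o for o in conditioned if event(o)]
    return Fraction(len(hits), len(conditioned))


# Hypothetical events on an outcome o = (first die, second die).
def A(o):
    return o[0] % 2 == 0      # first die is even


def B(o):
    return o[0] + o[1] >= 8   # the sum is at least 8


def C(o):
    return o[1] > 2           # second die shows more than 2


lhs = prob(lambda o: A(o) and B(o), given=C) / prob(B, given=C)
rhs = prob(A, given=lambda o: B(o) and C(o))
print(lhs, rhs)
```

The two sides agree for any events with P(B, C) > 0, since the P(C) factors cancel.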
NEW QUESTION 34
You have data for 1000 patients with their height and age, where age is in years and height is in meters. You want to create clusters using these two attributes, and you want age and height to have a nearly equal effect when the clusters are created. What can you do?
- A. You will be adding height with the numeric value 100
- B. You will be dividing both age and height with their respective standard deviation
- C. You will be converting each height value to centimeters
- D. You will be taking square root of height
Answer: B,C
Explanation:
Age in years takes values such as 50, 60, 70 or 90. When calculating the distance from a centroid, the maximum possible difference can be 90 - 0, and its square is 8100.
Height in meters may differ by at most about 2 - 0.5 = 1.5 meters, and its square is only 2.25. So age has far more effect than height. To bring height to a comparable level, you can convert it to centimeters. Values then range up to about 200 centimeters, so height contributes on a scale similar to age.
Another approach is to divide each value by the standard deviation of its attribute, for example age / sd(age), which yields a unitless value. This also reduces the effect of units.
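Dividing by the standard deviation can be sketched with the Python standard library. The patient values and the name `scale_by_std` are hypothetical; the population standard deviation (`pstdev`) is used here, though the sample version would serve equally well for this purpose.

```python
from statistics import pstdev

# Hypothetical patient data: age in years, height in meters.
ages_years = [20.0, 40.0, 60.0, 80.0]
heights_m = [1.5, 1.6, 1.7, 1.8]


def scale_by_std(values):
    """Divide each value by the attribute's standard deviation, giving unitless values."""
    sd = pstdev(values)
    return [v / sd for v in values]


scaled_ages = scale_by_std(ages_years)
scaled_heights = scale_by_std(heights_m)

# After scaling, both attributes have a standard deviation of 1,
# so neither dominates a squared-distance calculation.
print(pstdev(scaled_ages), pstdev(scaled_heights))
```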
NEW QUESTION 35
What describes a true limitation of the Logistic Regression method?
- A. It does not have explanatory values.
- B. It does not handle missing values well.
- C. It does not handle redundant variables well.
- D. It does not handle correlated variables well.
Answer: B
NEW QUESTION 36
......
Databricks Databricks-Certified-Professional-Data-Scientist Exam Syllabus Topics:
| Topic | Details |
|---|---|
| Topic 1 | |
| Topic 2 | |
| Topic 3 | |
| Topic 4 | |
| Topic 5 | |
| Topic 6 | |
| Topic 7 | |
Free Databricks-Certified-Professional-Data-Scientist braindumps download: https://www.braindumpsit.com/Databricks-Certified-Professional-Data-Scientist_real-exam.html