Week 1 Quiz - Recurrent Neural Networks
1. Suppose your training examples are sentences (sequences of words). Which of the following refers to the $j^{th}$ word in the $i^{th}$ training example?
...
📌 The parentheses represent the $i^{th}$ training example and the brackets represent the $j^{th}$ word. You should choose the training example first and then the word.
2. Consider this RNN:
True/False: This specific type of architecture is appropriate when $T_x = T_y$.
True
False
📌 This type of architecture is appropriate for applications where the input and output sequence lengths are the same.
3. Select the combination of two tasks that could be addressed by a many-to-one RNN model architecture:
Task 1: Speech recognition. Task 2: Gender recognition
Task 1: Image classification. Task 2: Sentiment classification.
Task 1: Gender recognition from audio. Task 2: Movie review (positive/negative) classification.
Task 1: Gender recognition from audio. Task 2: Image classification.
📌 Gender recognition from audio and movie review (positive/negative) classification are two examples of tasks suited to a many-to-one RNN architecture.
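A many-to-one architecture consumes the whole input sequence and emits a single output at the end. The numpy sketch below is illustrative only (the function and weight names are not from the course code); it runs a plain RNN cell over a toy sequence and produces one output vector after the last step.

```python
import numpy as np

def many_to_one_rnn(x_seq, Wax, Waa, Wya, ba, by):
    """Run a plain RNN over a sequence and emit ONE output at the end
    (many-to-one), e.g. a sentiment score for a whole review."""
    a = np.zeros((Waa.shape[0], 1))        # initial hidden state a<0>
    for x_t in x_seq:                      # consume every time step
        a = np.tanh(Wax @ x_t + Waa @ a + ba)
    return Wya @ a + by                    # single output after the last step

rng = np.random.default_rng(0)
n_a, n_x, n_y, T = 4, 3, 1, 5              # toy sizes
x_seq = [rng.standard_normal((n_x, 1)) for _ in range(T)]
y = many_to_one_rnn(x_seq,
                    rng.standard_normal((n_a, n_x)),
                    rng.standard_normal((n_a, n_a)),
                    rng.standard_normal((n_y, n_a)),
                    np.zeros((n_a, 1)),
                    np.zeros((n_y, 1)))
print(y.shape)  # (1, 1): one output for the whole sequence
```

A many-to-many model would instead apply `Wya @ a + by` at every time step inside the loop.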
4. Using the training model below, answer the following:
True/False: At the $t^{th}$ time step the RNN is estimating $P(y^{<t>} \mid y^{<1>}, y^{<2>}, \dots, y^{<t-1>})$.
True
False
📌 In a training model we try to predict the next step based on knowledge of all prior steps.
5. You have finished training a language model RNN and are using it to sample random sentences, as follows:
True/False: In this sampling procedure, step t uses the probabilities output by the RNN to randomly sample a chosen word for that time-step. Then it passes this selected word to the next time-step.
True
False
📌 Step t uses the probabilities output by the RNN to randomly sample a chosen word for that time-step. Then it passes this selected word to the next time-step.
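The sampling step can be sketched in a few lines of numpy. This is a minimal illustration, not the course's sampling code: it draws one word index from a toy softmax distribution, exactly the operation performed at each time step before feeding the sampled word back in as the next input.

```python
import numpy as np

def sample_next_word(probs, rng):
    """Randomly sample a word index from the softmax probabilities
    the RNN outputs at one time step."""
    return rng.choice(len(probs), p=probs)

rng = np.random.default_rng(0)
probs = np.array([0.1, 0.7, 0.2])   # toy distribution over a 3-word vocabulary
idx = sample_next_word(probs, rng)
# idx is then fed back as the input x<t+1> for the next time step
assert 0 <= idx < len(probs)
```

Using `rng.choice` with `p=probs` (rather than `argmax`) is what makes the generated sentences random rather than deterministic.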
6. True/False: If you are training an RNN model, and find that your weights and activations are all taking on the value of NaN ("Not a Number"), then you have an exploding gradient problem.
True
False
📌 Exploding gradients happen when large error gradients accumulate and result in very large updates to the model weights during training. These weights can become too large, overflow, and show up as NaN.
7. Suppose you are training an LSTM. You have an 80,000-word vocabulary, and are using an LSTM with 800-dimensional activations $a^{<t>}$. What is the dimension of $\Gamma_u$ at each time step?
800
...
📌 $\Gamma_u$ is a vector of dimension equal to the number of hidden units in the LSTM.
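This can be verified from the gate equation $\Gamma_u = \sigma(W_u[a^{<t-1>}, x^{<t>}] + b_u)$: $W_u$ has one row per hidden unit, so the gate's dimension matches the hidden state, not the vocabulary. A toy-scale numpy check (the vocabulary is shrunk from 80,000 to 50 purely to keep the arrays small; the gate dimension is unaffected):

```python
import numpy as np

n_a, vocab = 800, 50                 # hidden units, (shrunk) vocabulary size
x_t = np.zeros((vocab, 1))           # one-hot input word x<t>
a_prev = np.zeros((n_a, 1))          # previous hidden state a<t-1>
Wu = np.zeros((n_a, n_a + vocab))    # update-gate weights: one row per hidden unit
bu = np.zeros((n_a, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Gamma_u = sigmoid(Wu [a<t-1>, x<t>] + bu)
gamma_u = sigmoid(Wu @ np.vstack([a_prev, x_t]) + bu)
print(gamma_u.shape)  # (800, 1): gate dimension = hidden units, not vocabulary
```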
8. True/False: In order to simplify the GRU without vanishing gradient problems even when training on very long sequences, you should remove $\Gamma_r$, i.e., setting $\Gamma_r = 1$ always.
True
False
📌 If $\Gamma_u \approx 0$ for a timestep, the gradient can propagate back through that timestep without much decay. For the signal to backpropagate without vanishing, we need $c^{<t>}$ to be highly dependent on $c^{<t-1>}$.
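The mechanism is visible directly in the GRU memory update $c^{<t>} = \Gamma_u \tilde{c}^{<t>} + (1 - \Gamma_u)\, c^{<t-1>}$. A minimal sketch of just that equation (function name is illustrative), showing that when $\Gamma_u \approx 0$ the old memory passes through essentially unchanged:

```python
import numpy as np

def gru_memory_step(c_prev, c_tilde, gamma_u):
    """GRU memory update: c<t> = Gamma_u * c~<t> + (1 - Gamma_u) * c<t-1>."""
    return gamma_u * c_tilde + (1 - gamma_u) * c_prev

c_prev = np.array([1.0, -2.0])     # memory carried from the previous step
c_tilde = np.array([0.3, 0.9])     # candidate replacement value

# With Gamma_u ~= 0 the old memory survives intact, which is what lets
# gradients flow back through many timesteps without vanishing.
c_t = gru_memory_step(c_prev, c_tilde, gamma_u=0.0)
print(c_t)  # identical to c_prev
```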
9. True/False: Using the equations for the GRU and LSTM below, the Update Gate and Forget Gate in the LSTM play a role similar to $\Gamma_u$ and $\Gamma_r$.
True
False
📌 No. Instead of using $1 - \Gamma_u$ to compute $c^{<t>}$, the LSTM uses 2 gates ($\Gamma_u$ and $\Gamma_f$) to compute the final value of the hidden state. So, $\Gamma_f$ is used instead of $1 - \Gamma_u$.
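The contrast with the GRU shows up in the LSTM memory update $c^{<t>} = \Gamma_u \tilde{c}^{<t>} + \Gamma_f\, c^{<t-1>}$: because $\Gamma_f$ is independent of $\Gamma_u$, the LSTM can keep old memory and write new content at the same time. A minimal sketch of that equation (function name is illustrative):

```python
import numpy as np

def lstm_memory_step(c_prev, c_tilde, gamma_u, gamma_f):
    """LSTM memory update: c<t> = Gamma_u * c~<t> + Gamma_f * c<t-1>.
    Gamma_f is a separate gate, NOT forced to equal 1 - Gamma_u as in the GRU."""
    return gamma_u * c_tilde + gamma_f * c_prev

c_prev = np.array([1.0])
c_tilde = np.array([0.5])

# Both gates fully open: old memory is kept AND new content is added,
# something the GRU's single-gate blend (Gamma_u vs 1 - Gamma_u) cannot do.
c_t = lstm_memory_step(c_prev, c_tilde, gamma_u=1.0, gamma_f=1.0)
print(c_t)  # [1.5] = 1.0 * 0.5 + 1.0 * 1.0
```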
10. Your mood is heavily dependent on the current and past few days' weather. You've collected data for the past 365 days on the weather, which you represent as a sequence $x^{<1>}, \dots, x^{<365>}$. You've also collected data on your mood, which you represent as $y^{<1>}, \dots, y^{<365>}$. You'd like to build a model to map from $x \rightarrow y$. Should you use a Unidirectional RNN or Bidirectional RNN for this problem?
Unidirectional RNN, because the value of $y^{<t>}$ depends only on $x^{<1>}, \dots, x^{<t>}$, but not on $x^{<t+1>}, \dots, x^{<365>}$.