Google Cloud Data Engineering
The depth and intensity of reading and practice required for the Data Engineering exam
- 2. How can the workers of a cluster download dependencies from the internet if the cluster nodes have no egress/ingress allowance?
- 3. How do Cloud Datastore and Spanner differ, and how do Spanner and Bigtable differ?
- 4. Where would you store 2 PB of (key, value) data: Datastore, Spanner, or Bigtable? What about 1 TB?
- 5. Storage Transfer Service vs. Transfer Appliance. Can you use a private web address to transfer 2 PB of data over six months with Storage Transfer Service? docs: https://cloud.google.com/storage-transfer/docs/overview https://cloud.google.com/storage-transfer/docs/on-prem-overview#requirements
- 6. How would you partition a BigQuery table for easy querying when the dataset has a timestamp and a unique ID?
- 7. Dataflow template vs. DAG on Cloud Composer for running Spark when some of the jobs run in sequence and others concurrently? docs: https://cloud.google.com/composer/docs/how-to/using/using-dataflow-template-operator
MySQL plugin for MariaDB with the Stackdriver agent?
- 10. Cloud ML vs. Dataproc Spark ML for existing Spark ML models? And where do you store the data: Cloud Storage or BigQuery?
- 14. How do you improve Area Under the Curve (AUC)? Hyperparameter tuning, model deployment?
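
For the AUC question, it may help to be clear about what the metric measures before thinking about how to raise it. The sketch below computes ROC AUC in plain Python as the fraction of (positive, negative) pairs that the model ranks correctly, counting ties as half. The function name `auc`, the labels, and both score lists are made up for illustration; "tuned" simply stands in for a model whose hyperparameters were adjusted, and it is not output from any real training run.

```python
def auc(labels, scores):
    """ROC AUC as the share of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive example scored above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = positive class, 0 = negative class.
labels = [0, 0, 1, 1]
baseline_scores = [0.1, 0.4, 0.35, 0.8]  # one negative outranks a positive
tuned_scores = [0.1, 0.3, 0.35, 0.8]     # every positive outranks every negative

print(auc(labels, baseline_scores))  # 0.75
print(auc(labels, tuned_scores))     # 1.0
```

Because AUC depends only on how examples are ranked, changes that improve the ranking (such as hyperparameter tuning) can raise it, whereas deploying the same model leaves it unchanged.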