ETL pipeline documentation is great for team communication as well as data stewardship! Read my blog post to learn my tips and tricks.
Tag Archives: design data pipeline
In R, you can refer to a column either by its number (position) or by its field name. Although field name syntax is easier to read and use in programming, my blog post demonstrates how you can use column numbers to make automation easier.
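To make the two syntaxes concrete, here is a minimal sketch in base R. The data frame `df` and its column names are made up for illustration and are not taken from the blog post; the loop is just one way column numbers might help with automation.

```r
# Hypothetical example data; the column names are illustrative only.
df <- data.frame(
  id = 1:3,
  height_cm = c(160, 172, 181),
  weight_kg = c(55, 70, 82)
)

# Field name syntax: readable, and still works if the column order changes.
by_name <- df[["height_cm"]]

# Column number syntax: the same column, referenced by its position.
by_number <- df[[2]]

# Both syntaxes return the same vector.
same_column <- identical(by_name, by_number)

# Column numbers make it easy to loop over a range of columns.
col_means <- numeric(0)
for (i in 2:3) {
  col_means[names(df)[i]] <- mean(df[[i]])
}
print(col_means)
```

The trade-off is that column numbers break silently if the column order changes, so it may be worth checking `names(df)` before relying on positions.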
This lively panel discussed many topics around designing and implementing machine learning pipelines. Two main issues were identified. The first is that you have to take time up front to do exploratory research and define the problem. The second is that you also need to understand the business rules and context behind the data.