What are Database Constraints?
Database constraints are rules or restrictions applied to table columns to limit the type of data that can be persisted in a table. They provide a way to…
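To make the idea concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The `employee` table, its columns, and the `try_insert` helper are hypothetical names chosen for illustration and do not come from the original post. It shows three common constraints (NOT NULL, UNIQUE, and CHECK) rejecting rows that break the rules.

```python
import sqlite3


def try_insert(conn, name, age):
    """Attempt one insert; return None on success, or the error text if a constraint rejects it."""
    try:
        # Using the connection as a context manager commits on success
        # and rolls back if an exception is raised.
        with conn:
            conn.execute(
                "INSERT INTO employee (name, age) VALUES (?, ?)", (name, age)
            )
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        id   INTEGER PRIMARY KEY,          -- uniquely identifies each row
        name TEXT NOT NULL UNIQUE,         -- must be present and must not repeat
        age  INTEGER CHECK (age >= 18)     -- must satisfy the condition
    )
    """
)

results = [
    try_insert(conn, "alice", 30),  # valid row
    try_insert(conn, "alice", 41),  # violates UNIQUE on name
    try_insert(conn, None, 25),     # violates NOT NULL on name
    try_insert(conn, "bob", 12),    # violates CHECK on age
]
row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]

for outcome in results:
    print("accepted" if outcome is None else f"rejected: {outcome}")
print("rows stored:", row_count)
```

Only the first row is stored. The exact wording of the error messages depends on the SQLite version, so the sketch prints them rather than assuming their text.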
Introduction to Data Lake
A Data Lake is a centralized data-centric storage architecture that is used for persisting a variety of data in its raw, unfiltered, and untransformed format. It…
JDBC stands for Java Database Connectivity. It is a standard Java API for interacting with a range of databases. We can access virtually any data source, ranging from relational databases…
If you are switching from Hortonworks Data Platform (HDP) 2.6 to 3.0+, you will have a hard time accessing Hive tables through the Apache Spark shell. HDP 3 introduced something called…
curl is a command or tool that can transfer data from or to a server using any of the supported protocols. curl supports most of the major protocols and is…
A function is a small piece of code or command which can be used multiple times in a script. A function in bash can be created using the function keyword.…
An Enterprise Data Warehouse (EDW) has many layers, such as Integration, Semantic, and Performance, each serving its own purpose. In this blog post, we will go into detail about each of these layers.…
We can automate different tasks in Linux including generating reports, monitoring processes, automating backups, submitting Spark jobs, and many more. There are times when jobs fail and a report needs…
UDAF stands for User Defined Aggregate Function. Aggregate functions perform a calculation on a set of values and return a single value. Writing an aggregate function is harder than writing a User Defined Function (UDF) because it has to aggregate across multiple rows and columns. An Apache Spark UDAF operates on more than one row or column and returns a single value.
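The "many rows in, one value out" contract can be sketched in plain Python. This is a conceptual illustration only, not the Spark API: the `MeanBuffer`, `MeanAggregate`, and `aggregate_partitions` names are hypothetical, and the initialize/update/merge/evaluate shape is only loosely modeled on how distributed aggregate functions are commonly structured. It assumes rows arrive in separate partitions whose partial results must be merged.

```python
from dataclasses import dataclass


@dataclass
class MeanBuffer:
    """Intermediate state carried while rows are being aggregated."""
    total: float = 0.0
    count: int = 0


class MeanAggregate:
    """Hypothetical aggregate computing the mean of non-null values."""

    def initialize(self):
        # Start each partition with an empty buffer.
        return MeanBuffer()

    def update(self, buf, value):
        # Fold one input row into the buffer, skipping nulls.
        if value is not None:
            buf.total += value
            buf.count += 1
        return buf

    def merge(self, left, right):
        # Combine two partial buffers produced by different partitions.
        return MeanBuffer(left.total + right.total, left.count + right.count)

    def evaluate(self, buf):
        # Turn the final buffer into the single result value.
        return buf.total / buf.count if buf.count else None


def aggregate_partitions(agg, partitions):
    """Aggregate each partition separately, then merge the partial buffers."""
    merged = agg.initialize()
    for rows in partitions:
        buf = agg.initialize()
        for value in rows:
            buf = agg.update(buf, value)
        merged = agg.merge(merged, buf)
    return agg.evaluate(merged)


result = aggregate_partitions(MeanAggregate(), [[1.0, 2.0], [3.0, None], []])
print(result)  # 2.0
```

The extra `merge` step is what makes an aggregate harder to write than a UDF: a UDF only maps one row to one value, while an aggregate must keep state and combine partial results correctly.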
With the introduction of Apache Hive 3, Apache Hadoop has gained several new features to address the growing needs of enterprise data warehouse systems. This blog post talks about several…