If you want to learn how to get information from your data with Transact-SQL, or shortly T-SQL language, then this course is the course for you. It will teach you how to calculate statistical measures from descriptive statistics including centers, spreads, skewness and kurtosis of a distribution, find the associations between pairs of variables, including calculating the linear regression formula, calculate the confidence level with definite integration, find the amount of information in your variables, and do also some machine learning or data science analysis, including predictive modeling and text mining.
T-SQL language in latest editions of SQL Server, Azure SQL Database, and Azure Synapse Analytics, has so many business intelligence (BI) improvements that it might become your primary analytic database system. Many database developers and administrators are already proficient with T-SQL. Occasionally they need to analyze the data with statistical or data science methods, but they do not want to or have time to learn a completely new language for these tasks. In addition, they need to analyze huge amounts of data, where specialized languages like R and Python might not be fast enough. SQL Server has been optimized for work with big datasets for decades.
In order to get the maximum out of these language constructs, you need to learn how to properly use them. This in-depth course shows extremely efficient statistical queries that use the window functions and are optimized through algorithms that use mathematical knowledge and creativity. The formulas and usage of those statistical procedures are explained as well.
Any serious analysis starts with data preparation. The course introduces some common data preparation tasks and shows how to implement them in T-SQL. No analysis is good without good data quality. The course introduces data quality issues, and shows how you can check for completeness and accuracy with T-SQL, and how to measure improvements of data quality over time. And since the talk is already about the time, the course shows how you can optimize queries with temporal data, for example when you search for overlapping intervals. More advanced time-oriented information includes hazard and survival analysis.
Then the course switches to the currently most fashionable topic, the data science. Some of quite advanced algorithms can also be implemented in T-SQL. The reader learns about the market basket analysis with association rules using different measures like support and confidence, and even sequential market basket analysis, when there is a sequence in the basket. Then the course shows how to develop predictive models with a mixture of k-nearest neighbor and decision trees algorithms and with Bayesian inference analysis.
Analyzing text, or text mining, is another modern topic. However, many people do not realize that you can do really a lot of text mining also in pure T-SQL. SQL Server can also become a text mining engine. The course shows how to analyze text in multiple natural languages with pure T-SQL, using also features from the full-text search (FTS).
In short, this course teaches you how to use T-SQL for:
- Statistical analysis
- Data science methods
- Text mining.
What Attendees Will Learn?
- Describe the distribution of a variable with statistical measures.
- Find associations between pairs of variables.
- Evaluate the quality of the data.
- Analyze data over time.
- Do the market basket analysis.
- Predict outcome of a target variable by using few input variables.
- Extract the key words from text data in order to categorize the texts.
Advanced Analytics with Transact-SQL is a course for database developers and database administrators who want to take their T-SQL programming skills to the max. It is for the attendees who want to analyze huge amounts of data in an efficient way by using their existing knowledge of the T-SQL language. It is also for the attendees who want to improve their querying by teaching them new and original optimization techniques.
- Descriptive Statistics
- Associations between Pairs of Variables
- Data Preparation
- Data Quality
- Time-Oriented Data
- Time-Oriented Analyses
- Data Mining
- Text Mining
Dejan Sarka, MCT and Data Platform MVP, is an independent trainer and consultant that focuses on development of database and business intelligence applications, with more than thirty years of experience in this field. Besides projects, he spends about half of his time on training and mentoring. He is the founder of the Slovenian SQL Server and .NET Users Group. Dejan Sarka is the main author or co-author of nineteen books about databases and SQL Server before this book. Dejan Sarka has also developed many courses and seminars for Microsoft, Radacad, SolidQ, and Pluralsight.