After Microsoft abandoned Windows and embraced Linux, the first SQL Server On Linux book was published in China, which taught a lot of new ways.

SQL Server as Microsoft’s famous database management system, for many years has been firmly in the top three of various database rankings. The SQL Server database management system, originally developed by James Gray, a Turing prize database winner, and based on Ingres developed by Michael Stonebraker, another Turing prize database winner, has been widely recognized and applied in the industry after more than 30 years of practice.

As a database product that already easily supports TB or even PB levels of data, the proper use of SQL Server is sufficient to meet most common requirements. Due to some reasons, SQL Server has not been widely regarded in China. This phenomenon has changed greatly since Satya Nadella, the third CEO of Microsoft, took office in 2014 and proposed “Microsoft Open Source”. As Microsoft products such as.net, Visual Studio and Office gradually realize cross-platform applications, SQL Server can officially run on mainstream Linux platforms in 2016 (currently it supports Red Hat/CentOS, Ubuntu and SUSE).

With the official release of SQL Server 2017, SQL Server in addition to cross-platform, but also introduced a large number of new functions, these functions not only to the database performance and database management efficiency brought significant improvement, but also accelerate the integration of SQL Server and big data, artificial intelligence field.

In view of the fact that there is no systematic introduction of books about SQL Server running On Linux in China at present, and the online information is too scattered, the author wrote “SQL Server On Linux Operation and Maintenance actual combat” after nearly a year of practice and data integration, in order to let readers get started as soon as possible. Master the installation, configuration, management and performance optimization methods of SQL Server based on Linux. Some of the book is not limited to, but mostly Linux.



SQL Server On Linux Operation and Maintenance

Authors: Huang Zhaoji, Xiong Zhangli



  • SQL Server phoenix nirvana, rebirth from the ashes;

  • Cross-platform, Linux and Windows;

  • From beginner to master, highlight operation and maintenance monitoring, performance tuning;

  • Integrated with the author’s 10 years of work experience, systematic, practical, well-read reference book.


By reading this book, you will learn how to use Linux-based SQL Server, how to better manage SQL Server on Linux, and how to perform general SQL Server performance tuning.

The new version of SQL Server running on Linux combines its new features and powerful functions with the advantages of Linux native system, which not only brings great convenience and improvement in operation and maintenance and performance. At the same time, the gradual compatibility of big data and data mining fields (such as built-in machine learning, graphics processing, compatibility with HDFS (Hadoop distributed file system), Containers, etc.) enables enterprises to avoid the need to build from scratch or migrate the existing environment to other relational databases “suitable” for the big data environment.

01

Why SQL Server on Linux

In the past, because big data was generated in Linux, SQL Server could not directly access HDFS (in fact, it has been accessible since SQL Server 2012, but it needs some auxiliary drivers) and could not run on Linux. Therefore, SQL Server considered it to be an abandoned child of The Times. Can only hold on to the existing one mu three areas, will struggle in the traditional field and slowly die.


Since the current CEO of Microsoft (Satya Nadtert) came to office, Microsoft has undergone earth-shaking changes, with the cloud first, mobile first strategy, the “Microsoft Open Source” general policy is deepening, Microsoft’s complacent image is gradually changing. SQL Server is Microsoft’s most important database product that is not exclusive to the cloud (Microsoft Cloud has other database products such as Azure Cosmos DB). In the era of data technology (DT) and artificial intelligence (AI), naturally, it also has revolutionary changes.


Readers who have worked in IT have heard, at some point, that “properly managed” Linux servers perform better than Windows Servers in terms of performance and stability in large-scale environments. One example is that Linux performs significantly better than Windows Server in terms of Server restart frequency and software updates and patch fixes.


In fact, SQL Server has been running on Linux since SQL Server 2016. The official release of SQL Server 2017 further enables seamless connectivity between Windows Server and mainstream Linux platforms. At the same time, not only SQL Server database engine, but also SQL Server Agent, some high availability technologies, etc., all can run on Linux platform.


Let’s take a look at why SQL Server on Linux was chosen.

The following is only the author’s summary and does not represent the official statement. The following is only the author’s summary and does not represent the official statement.

  • Big Trends: DT, AI, big data, IoT (Internet of Things), blockchain, etc., are the hottest things in recent years, and they all happen in the Linux ecosystem without exception. Not to say that Windows is bad, but from the current situation, we have to face Linux, Linux knowledge can keep up with the trend, not be eliminated by The Times. As a Microsoft technician, learning Linux alone is a bit of a resistance. Now you can combine SQL Server to learn, the author thinks this is the best entry point, no matter reasonable, give yourself a reason to learn is not bad.

  • New version new experience: Whether IT is SQL Server or other mainstream database management system, each new version brings a variety of new features, new experience. In the last few big releases (especially 2012 and later), YOU can see that SQL Server is constantly improving in the areas of data engines, traditional BI, and even big data. For example, we can use some functions (such as partition) to manage large libraries/tables, greatly improve the performance of OLTP by in-memory technology, achieve read/write separation and HA+DR scheme by AlwaysON technology, improve resource utilization and system stability, and improve the function of security. This book will introduce and demonstrate some of the more important features.

  • Advantages and disadvantages of pure SQL Server: With the exception of a few readers who are interested in SQL Server On Windows or SQL Server On Linux who may not have been exposed to SQL Server, most readers should have used SQL Server, regardless of the rankings, Objectively compare the current mainstream relational database management systems.

Mainstream relational database management systems

Open source: Although SQL Server is not open source, Oracle/DB2/SAP are not open source either.

Native load balancing: Although SQL Server does not support native load balancing, MySQL does not support native load balancing either. Oracle will have to pay through the nose.

Cross-platform: It is no longer a topic of discussion since SQL Server 2016.

Supported data amount: As for the larger system (such as a single table more than 100 million rows), MySQL I’m afraid I have to use depots table or other technology to share the load, the SQL Server is not necessarily, in the premise of rational design, hardware is sufficient, appropriate some processing (such as table partitioning, file group split, etc.), it is easy to deal with hundred million level table efficient operation.

Performance: Oracle without expert tuning is not necessarily better than SQL Server unless the business logic and data volume are determined to a certain extent. Similarly, Linux, untuned, is less secure than Windows. It’s all about how you use them.

Word of mouth: Maybe the author does not know much about it. So far, the author has never heard domestic famous Oracle experts (including ACE and ACE Director) say that other products (especially SQL Server) are not good in public. They are not clear about the differences between different products, but do not want to waste time and energy in these meaningless quarrels. If you are not proficient in two or more database products (the one on your resume doesn’t count), don’t compare and judge. Still the same sentence: not the product is really bad, but your level is limited. In fact, SQL Server live up to now, or it has its advantages. In addition, during the course of writing this book, I also read an article about SQL Server AlwaysON on Linux by an Oracle ACED person in The country. As the highest title granted by Oracle to non-internal employees, they are also learning SQL Server. The author is not talking about how SQL Server is good, but it is a matter of attitude, with an open and inclusive attitude, you can go far.

For more information on SQL Server performance on Linux, see the official documentation. Bing searches for “Microsoft, Red Hat, and HPE Collaboration Delivers Choice & Value to Enterprise Customers.”

02

Introduction to new features of SQL Server On Linux

Just like human evolution, traditional database software must make a lot of “evolution” and even “innovation” in order to cope with the development of The Times. Some of them are invisible to us. For example, FROM SQL Server 2000 to SQL Server 2005, a complete refactoring was introduced to add SQLOS, and before SQL Server 2016, most of our DMVS came from here. Starting with SQL Server On Linux/SQL Server 2017, SQL PAL was introduced, which is transparent to users and even DBAs. These are often the selling points of the product.

In recent years, SQL Server has released a major release every two years on average. Each major release contains a large number of new features. There is no need to elaborate on them one by one.

A complete list of new features is available in the official documentation, and you can search for “New features in SQL Server 2016” and its extended reading section in Bing.

According to the official installation process, this section also introduces four aspects: database engine, SSIS, machine learning and Compatibility with Linux. The small brackets before each feature indicate the version in which it first appeared.

2.1 Database Engine

Database management (including database security)

This part is the basic function of database management system. It makes the database “available”, and only the available database can continue to perform performance, high availability, and data utilization functions. SQL Server 2016 ~ 2017 has many new functions for database management or the original function of the enhancement, readers may use the following functions.

  • (2016+) TempDB Enhancement: Multiple TempDB data files can be configured during installation, which is only a ‘reminder’ to users because it is not fundamentally different from the past. In the past, splitting multiple files for TempDB was itself one of the best practices, but it was not present during installation.

  • (2016+) Temporal Tables: Officially called temporary Tables, but to avoid confusion with commonly used temporary Tables, this book uses a different name from the Bing translation, “Temporal Tables”, which record all data changes and their dates, as detailed in Section 7.5.

  • (2016+) Stretch Database: Dynamically and securely store data from the local Database to an Azure SQL Database. SQL Server can transparently query local data and remote data in the connected Database. It is suitable for archiving hot and cold data. Since this is not currently supported in the Linux version and requires the use of an Azure environment, it is not suitable for getting started, so it is not covered in this book.

  • (2016+) Always Encrypted: AE for short, commonly known as Always Encrypted in Chinese. When enabled, only applications with encryption keys can access encryption-sensitive data in the database, and the keys are never passed to SQL Server.

  • (2016+) Dynamic Data Masking: DDM for short, which is commonly called Dynamic Data Masking in Chinese. After using it, users without UNMASK privileges can only see masked data.

  • (2016+) row-level Security: RLS, commonly known as row-level Security. It can restrict data access at the database engine level so that users can only see rows of data associated with them.

  • Resumable Online Index Rebuild (2017+) Resumable Online Index Rebuild (Resumable Online Index Rebuild) It is similar to the pause function and is very helpful for the maintenance of large indexes.

  • (2017+) Improved backup performance of small databases on high-end servers: make backups smarter.

  • (2017+) Graphics Processing: With the increasing popularity and depth of data analysis, a large number of graphics can be processed in databases in familiar languages, but this section is beyond the scope of this book.

Database performance (including fault detection)

Performance and fault detection will be discussed in more detail in the Performance section, as each new release of SQL Server inevitably brings significant performance improvements. In the past, if you want to know what time to perform the SQL statements, what led to the performance issues, often can only be obtained from a large number of monitoring data, a lot of work, and sometimes do not get (such as downtime or file damage, etc.), and starting from the SQL Server 2016 provides query storage function can slow down this phenomenon well. The smart optimization features of SQL Server 2017 can also help enterprises without high-level DBAs to reduce their performance impact.

The author chooses several more characteristic functions, carries on certain demonstration and introduction.

  • (2016+) Query Store: Stores Query text, corresponding execution plans, and performance indicators, such as the SQL that takes the longest time and consumes the most CPU or memory. See Section 12.2 for details.

  • (2014+) In-memory OLTP: Introduced from SQL Server 2014 and continuously improved and enhanced In subsequent releases. It provides significant performance improvements over OLTP loads. See Chapter 16 for details.

  • (2017+) A new generation of query processors that optimize the runtime state of workloads, also known as adaptive query processing. In the past, performance analysis was often done after the fact, and a lot of valuable information was lost in the process. The new generation of query processors can be optimized for runtime, closer to “ideal” optimization. For details, see Introduction to performance.

  • (2017+) Automatic database optimization: SQL Server itself conducts in-depth research on potential performance problems, proposes solutions and automatically resolves identified problems. For details, see Introduction to performance.

Etc.), and the query storage function provided from SQL Server 2016 can alleviate this phenomenon. The smart optimization features of SQL Server 2017 can also help enterprises without high-level DBAs to reduce their performance impact.

High availability

In modern society, core systems, large systems almost invariably use some highly available technology. The main purpose of high availability is to keep the system online and serviced for as long as possible. As the high availability technology promoted by Microsoft, AlwaysON has been continuously strengthened and improved since the emergence of SQL Server 2012, which is worth learning.

  • (2017+) All AlwaysON databases support cross-database transactions. (2017+) Supports no clustering, minimum copy commit, and Windows-Linux migration across operating systems.

Since the topic of this book is SQL Server On Linux, and it is not intended to delve deeply into AlwaysON, which is enough to be a book by itself, the book is devoted to building a demonstration of AlwaysON On Linux. For further advanced readers, please refer to the official documentation or the author’s blog.

Big data

(2016+) PolyBase query engine: For integrating SQL Server with external data from Hadoop or Azure Blob, it can be imported, exported, and queried. This is beyond the scope of this book.

In addition to PolyBase, SQL Server Integration Services (SSIS) enhances big data collection with Power BI for data analysis and presentation. This brings a lot of convenience for SQL Server to participate in the system construction of big data.

2.2 SQL Server Integration Services

This section is beyond the scope of this book and can be further explored by searching for “SQL Server Integration Services” in Bing.

  • (2016+) Support for AlwaysON availability groups.

  • (2016+) Support Always Encrypted.

  • (2016+) Hadoop file system (HDFS) support.

  • (2017+) New Scale Out function.

2.3 Machine learning

Originally a SQL Server R service, starting in 2017, it is Python-compatible and has been renamed SQL Server Machine Learning Service. You can use a “machine learning service” (in-database) to run R or Python scripts in SQL Server, or a “machine Learning Server” (standalone) to deploy and use R and Python models that do not require SQL Server. These are outside the scope of this book, but interested readers can visit the official materials and use Bing to search “What is SQL Server Machine Learning Services?”

Although this book does not mention, but this service will be a key service in the future, I suggest readers pay more attention to it.

2.4 Compatible with Linux platforms

SQL Server can run on Both Linux and Windows platforms, and most of the functionality is the same, with the exception of some platform-specific functionality differences, which the authors believe will become compatible over time. Because SQL Server On Linux is the main content of this book, I will not expand On it here.

2.5 other

Except for the above four categories, SQL Server on Linux and Windows are the same database engine, and most of the functions are the same. SQL Server On Linux supports Docker as well as Linux.

As of October 2017, SQL Server still has a number of features that need to be added gradually. For a full list of features, see the official website. Use Bing to search for “versions and support features” in the overview of “SQL Server on Linux”.

03

SQL Server On Linux

In conceiving this book, the author thought that it should be, first and foremost, a reference book, worthy of being read at any time; otherwise, it would be outdated and worthless; Second, it should be an introduction to the system knowledge book (this book is written more as a DBA perspective), not systematic knowledge is difficult to master.

How to use SQL Server On Linux depends On your actual requirements, but in addition to industry-specific features, we can consider using some of the following features for the following requirements.

Performance: Using in-memory OLTP greatly improves performance for routine OLTP operations, and using Columnstore technology for relatively static data, Columnstore is particularly suited for efficient queries to data warehouses. At the same time, Automatic Plan correction and Adaptive Query Processing can be used to optimize and improve database performance. AlwaysON technology can also be used for more intelligent load balancing of “read operations.”

HA/DR: SQL Server 2017 provides two different availability group architectures, AlwaysOn and Read-Scale. In a Linux environment, the former uses Pacemaker for Linux clustering for high availability, disaster recovery, and read load balancing. Read-scale, on the other hand, provides read-only copies to distribute the load of Read operations without the need for a cluster administrator, making it particularly suitable for hybrid operating system (Windows-Linux) environments. Note that it does not have high availability.

Security: TDE (Transparent data encryption) is used to protect the database security, and full-link encryption is realized by using the Always Encrypted function. Encryption functions such as RLS and DDM can also be used according to the actual security level required. It is important to remind that security is a technical issue as well as a management issue. No technology can completely eliminate security risks.

Operations: the focus of the operational problems is to make the system available, except for some high availability technology, also need to consider the historical data, disaster backup files of capacity, in addition to the daily maintenance of large system is one of the key, in addition to providing the normal function of SQL Server in the past, also can consider to use the following three functions work make operations more efficient.

  • Use Stretch Database to store historical data on Azure, enabling the highest performance resources to be dedicated to active data.

  • Improve the Rebuild of large indexes using the Resumable Online Index Rebuild feature.

  • With Azure, you can back up your database directly to Azure, reducing storage costs and risk of file corruption.

Big Data /AI: SQL Server runs on Linux with built-in Python and R support, graphics processing (available from SQL Server 2017), rich ARTIFICIAL intelligence, machine learning, and HDInsight capabilities on Azure, with PowerBI as data presentation, Can do most big data and even AI work very well. In the next few years, the authors believe that Microsoft will put a lot of effort in this area to make SQL Server more than just a traditional database management system.

04

Structure of the book

Here is a general introduction to the structure of this book, the goal of this book is to make it not only as an introductory book, but also as a reference book, at any time to read, so in the series of various knowledge points under the premise, but also try to take into account the relative independence of each chapter.


Simply put, because SQL Server has been running on Windows for 30 or 40 years and Linux for just over a year, plus Linux itself is just a “kernel.” Considering that even a simple Server requires a lot of configuration, SQL Server requires a lot of configuration on Linux, and many interface operations are simplified or even cancelled on Linux. But overall, it doesn’t make a huge difference.



SQL Server On Linux Operation and Maintenance

Authors: Huang Zhaoji, Xiong Zhangli






  • SQL Server phoenix nirvana, rebirth from the ashes;

  • Cross-platform, Linux and Windows;

  • From beginner to master, highlight operation and maintenance monitoring, performance tuning;

  • Integrated with the author’s 10 years of work experience, systematic, practical, well-read reference book.

– END –