#OpenMLDB

Summary

This week, 12 pull requests were merged, 4 pull requests were opened, 6 issues were closed, and 18 issues were opened. In total, 353 files were modified, with 36,056 lines of code added and 879 lines deleted.

Merged Pull Requests

  • feat: add integration test cicd #434
  • feat: add batchjob as java submodules #386
  • feat: add kubernetes java dependencies for taskmanager #400
  • fix: fix count in some yaml cases #436
  • feat: add a new optimization for expanding data in window skew optimization #424
  • feat: support insert multiple rows into a table using a single SQL insert statement #399
  • feat: support aggregation over the whole table #393
  • ci: openmldb java deploy workflow #366
  • test: rm DataSyncReplicaCluster test #415
  • feat: reconfiguration window skew optimization #414
  • feat: add integration test #395
  • feat: bump junit from 4.11 to 4.13.1 in /java/openmldb-batchjob #382
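
As a rough illustration of two of the merged features above (#399, multi-row insert, and #393, aggregation over the whole table), the sketch below uses Python's built-in sqlite3 module rather than OpenMLDB itself. The table name `t1` and its columns are hypothetical, and OpenMLDB's own SQL dialect and client API may differ from what is shown here.

```python
import sqlite3


def demo():
    # In-memory SQLite database, used only as a stand-in (assumption: not OpenMLDB).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (id INTEGER, amount REAL)")

    # A single INSERT statement that carries several rows.
    conn.execute("INSERT INTO t1 VALUES (1, 10.0), (2, 20.0), (3, 30.0)")

    # Aggregation over the whole table: no GROUP BY and no window clause.
    row = conn.execute(
        "SELECT COUNT(*), MAX(amount), MIN(amount), SUM(amount) FROM t1"
    ).fetchone()
    conn.close()
    return row


if __name__ == "__main__":
    print(demo())
```

The aggregate functions used are the ones named in closed issue #219 (COUNT, MAX, MIN, SUM).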

Open Pull Requests

  • build(deps): bump snakeyaml from 1.17 to 1.26 in /test/batch-test/openmldb-batch-test #416
  • build(deps): bump httpclient from 4.5.2 to 4.5.13 in /test/integration-test/openmldb-test-java/openmldb-test-common #417
  • feat: support in predicate #423
  • feat: reorganize error code and use check_status and check_true #435

Closed Issues

  • Make openmldb-batchjob and openmldb-taskmanager submodules of openmldb-parent #385
  • Support submitting and managing Kubernetes jobs for TaskManager #375
  • Bug: SQL INSERT statement with multiple rows does not work as expected #391
  • The rtiDB installed in Studio 4.2.0 core dumps on startup #278
  • Support general aggregate functions over a table: COUNT, MAX, MIN, SUM #219
  • feat: support integration test for java/python sdk and offline batch #316

Open Issues

  • Add feature extraction tools like detecting data skew #433
  • feat: try run benchmark on GitHub workflow, compare & upload test results #432
  • feat: refactor error/warning log in hybridse #430
  • refactor yaml sql test case #427
  • feat: improve cli and make the console output more clean and clear #426
  • RFC: Redesign some interfaces of SQLClusterRouter #425
  • Create memtable when creating procedure #422
  • Sync metadata to hive metastore when creating iceberg table #421
  • Load data from iceberg to memtable #420
  • Get index from sql & procedure #419
  • Create message table and sync data to nearline tablet #418
  • Add optimization passes for native LastJoin #413
  • Enable optimization of window parallel computation by default #412
  • Package OpenMLDB Spark distribution for release #411
  • Support table aggregation functions for Batch mode #410
  • Support passing Spark parameters for TaskManager #409
  • Refine the parameters from TaskManager API to support more job status #408
  • Integrate TaskManager API with OpenMLDB CLI tool #407

Highlights

This week, the IntegrationTest module was added and integrated into the CI/CD process. The codebase changed substantially, mainly through the addition of a large number of SQL test cases. A Kubernetes dependency was added to the TaskManager module to support task management across multiple computing clusters. The BatchJob module was added as a submodule of the Java project and integrated into the full CI/CD process. The project also officially passed the 2021 Trusted Open Source Project review of the ICT, and the hadoop-common dependency version was upgraded to resolve potential risks and fix the project's license dependency risks.

More developers are welcome to follow and contribute to the OpenMLDB open source project.