diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml
index 50b37fa6..fcf47f3a 100644
--- a/.github/workflows/build.yml
+++ b/.github/workflows/build.yml
@@ -9,12 +9,12 @@ on:
 
 jobs:
   build:
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-18.04
     env:
       ACTIONS_ALLOW_UNSECURE_COMMANDS: 'true'
     strategy:
       matrix:
-        python-version: [3.6, 3.7, 3.8]
+        python-version: [3.6.7]
 
     steps:
     - uses: actions/checkout@v2
diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml
index 423c7d5a..154bbf46 100644
--- a/.github/workflows/docs.yml
+++ b/.github/workflows/docs.yml
@@ -5,11 +5,11 @@ on:
       - master
 jobs:
   deploy:
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-18.04
     steps:
       - uses: actions/checkout@v2
       - uses: actions/setup-python@v2
         with:
-          python-version: 3.6
+          python-version: 3.6.7
       - run: python -m pip install -r requirements.txt -r requirements-dev.txt
       - run: mkdocs gh-deploy --force
diff --git a/README.md b/README.md
index 1561d1dd..943681c1 100644
--- a/README.md
+++ b/README.md
@@ -1,15 +1,14 @@
-**This project is no longer actively maintained by the core team. Please reach out to [@terrytangyuan](https://github.com/terrytangyuan) for any questions or potential collaborations.**
-
 # Couler
 
 ## What is Couler?
 
-Couler aims to provide a unified interface for constructing and managing workflows on
-different workflow engines, such as [Argo Workflows](https://github.com/argoproj/argo-workflows), [Tekton Pipelines](https://tekton.dev/), and [Apache Airflow](https://airflow.apache.org/).
+* Couler is a system designed for unified machine learning workflow optimization in the cloud. Couler endeavors to provide a unified interface for constructing and optimizing workflows across various workflow engines, such as [Argo Workflows](https://github.com/argoproj/argo-workflows), [Tekton Pipelines](https://tekton.dev/), and [Apache Airflow](https://airflow.apache.org/). Couler enhances workflow efficiency through features like Autonomous Workflow Construction, Automatic Artifact Caching Mechanisms, Big Workflow Auto Parallelism Optimization, and Automatic Hyperparameters Tuning.
+* Couler is included in [CNCF Cloud Native Landscape](https://landscape.cncf.io/) and [LF AI Landscape](https://landscape.lfai.foundation).
+* The technical report for Couler can be found at the following link: [Technical-Report](https://github.com/couler-proj/couler/tree/master/docs/Technical-Report-of-Couler)
 
-Couler is included in [CNCF Cloud Native Landscape](https://landscape.cncf.io/) and [LF AI Landscape](https://landscape.lfai.foundation).
+> Note that while one of the ambitious goals of Couler is to support multiple workflow engines, Couler currently only supports Argo Workflows as the workflow orchestration backend. An ambitious goal of Couler is to provide support for multiple workflow engines. While it initially supported only Argo Workflows for workflow orchestration, efforts are now underway to extend support to other workflow engines such as Tekton Pipelines and Apache Airflow.
+> In addition, if you are looking for a Python SDK that provides access to all the available features from Argo Workflows, you might want to check out [the low-level Python SDK maintained by the Argo Workflows team](https://argoproj.github.io/argo-workflows/client-libraries/).
 
-> Note that while one of ambitious goals of Couler is to support multiple workflow engines, Couler currently only supports Argo Workflows as the workflow orchestration backend. In addition, if you are looking for a Python SDK that provides access to all the available features from Argo Workflows, you might want to check out [the low-level Python SDK maintained by the Argo Workflows team](https://argoproj.github.io/argo-workflows/client-libraries/).
 
 ## Who uses Couler?
@@ -104,12 +103,12 @@ def recursive():
-Couler provides a unified interface for constructing and managing workflows that provides the following:
+Couler is a system for unified Machine Learning (ML) workflow optimization in the cloud. Its contributions are outlined below:
 
-* Simplicity: Unified interface and imperative programming style for defining workflows with automatic construction of directed acyclic graph (DAG).
-* Extensibility: Extensible to support various workflow engines.
-* Reusability: Reusable steps for tasks such as distributed training of machine learning models.
-* Efficiency: Automatic workflow and resource optimizations under the hood.
+* Simplicity and Extensibility: Couler provides a unified programming interface for workflow definition, ensuring independence from the workflow engine and compatibility with various workflow engines such as Argo Workflows, Airflow, and Tekton.
+* Automation: Couler integrates LLMs in unified programming code generation. By leveraging LLMs, Couler facilitates the generation of unified programming code from natural language (NL) descriptions. Additionally, we automate hyperparameter tuning through the integration of Dataset Card and Model Card, enhancing the effectiveness of the AutoML process.
+* Efficiency: Couler introduces the Intermediate Representation (IR) to depict the workflow Directed Acyclic Graph (DAG), optimizing extensive workflow computations by dividing a large workflow into smaller ones for auto-parallelism optimization. Couler also implements dynamic caching of artifacts, which are the outputs of jobs in the workflow, to minimize redundant computations and ensure fault tolerance.
+* Open Source Community: The released open-source version of Couler has garnered adoption from multiple companies and end-users. For instance, over 3000 end users are utilizing Couler within Ant Group, and more than 20 companies have adopted Couler as their default workflow engine interface.
 Please see the following sections for installation guide and examples.
@@ -244,3 +243,16 @@ any feedback and contributions from the community.
 
 * [Introducing Couler: Unified Interface for Constructing and Managing Workflows, Argo Workflows Community Meeting](https://docs.google.com/presentation/d/11KVEkKQGeV3R_-nHdqlzQV2uOrya94ra6Ilm_k6RwE4/edit?usp=sharing)
 * [Authoring and Submitting Argo Workflows using Python](https://blog.argoproj.io/authoring-and-submitting-argo-workflows-using-python-aff9a070d95f)
+
+## Citation
+
+Please cite the repo if you use the code in this repo.
+```bibtex
+@misc{Couler,
+  author = {Xiaoda Wang and Yuan Tang and Tengda Guo and Bo Sang and Jingji Wu and Jian Sha and Ke Zhang and Jiang Qian and Mingjie Tang},
+  title = {Couler: Unified Machine Learning Workflow Optimization in Cloud},
+  year = {2023},
+  publisher = {GitHub},
+  howpublished = {\url{https://github.com/couler-proj/couler}},
+}
+```
diff --git a/docs/Technical-Report-of-Couler/Couler_Optimizing-Machine-Learning-Workflows-in-Cloud.pdf b/docs/Technical-Report-of-Couler/Couler_Optimizing-Machine-Learning-Workflows-in-Cloud.pdf
new file mode 100644
index 00000000..2c1b6fd2
Binary files /dev/null and b/docs/Technical-Report-of-Couler/Couler_Optimizing-Machine-Learning-Workflows-in-Cloud.pdf differ
diff --git a/docs/Technical-Report-of-Couler/README.md b/docs/Technical-Report-of-Couler/README.md
new file mode 100644
index 00000000..694ac9ae
--- /dev/null
+++ b/docs/Technical-Report-of-Couler/README.md
@@ -0,0 +1 @@
+In this technical report, we delve into "Couler: Optimizing Machine Learning Workflows in Cloud", a framework designed to streamline the construction and execution of machine learning workflows. The report is segmented into three comprehensive chapters: Unified Programming Model, Implementation and Running Example.