Complete unclear dependencies for Ianvs #132

Open
FuryMartin opened this issue Aug 13, 2024 · 4 comments
Labels
kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt.

Comments

@FuryMartin
Contributor

FuryMartin commented Aug 13, 2024

What should be cleaned up or changed:

Complete the dependencies

The current Ianvs project has issues with unclear and incomplete dependencies, as pointed out in issues #103 and #106.

This poses significant challenges for users to configure the Ianvs environment.

I suggest:

Remove examples/resources/

There are also some dependencies within projects under examples/.

Since these projects are not closely related to Ianvs Core, why should their dependencies be placed in examples/resources?

We can provide an intuitive requirements.txt for each example, which aligns with Python usage habits and helps reduce the introduction of large binary packages into the Git repository.

I suggest:

  • Remove the method of managing third-party dependencies through examples/resources/, and embed dependencies of examples/xxxxx into a requirements.txt within each example directory.
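As a rough sketch of how this migration could be tracked, the helper below lists example directories that do not yet ship their own requirements.txt. The function name and the assumption that each example is a direct child of examples/ are illustrative only; the real layout may be nested deeper.

```python
from pathlib import Path
from typing import List


def examples_missing_requirements(examples_root: Path) -> List[Path]:
    """Return example directories that do not contain a requirements.txt."""
    missing = []
    for child in sorted(examples_root.iterdir()):
        # "resources" is the shared directory this issue proposes to remove.
        if not child.is_dir() or child.name == "resources":
            continue
        if not (child / "requirements.txt").is_file():
            missing.append(child)
    return missing


if __name__ == "__main__":
    root = Path("examples")  # assumed to be run from the repository root
    if root.is_dir():
        for path in examples_missing_requirements(root):
            print(f"missing requirements.txt: {path}")
```

Once every example has its own file, setup would reduce to a single `pip install -r` on that example's requirements.txt.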

Why is this needed:
Improving dependencies and aligning the new version interface will help Ianvs solve complex legacy dependency issues, allowing users to quickly get started with Ianvs.

@FuryMartin
Contributor Author

FuryMartin commented Aug 15, 2024

Considering that the API of dependencies in different versions of Python may vary significantly, we need to determine the minimum Python version supported by Ianvs.

Experiment on Quick Start - The PCB-AoI Example

I followed the Quick Start guide to run the PCB-AoI example. However, during the attempt, I found that this example currently has an irreconcilable dependency conflict.

Below are the results of setting up the dependencies under different Python versions:

Python 3.6

Step 1. Ianvs Preparation

We first need to install the sedna-0.4.1-py3-none-any.whl package located in examples/resources/third_party/.
In this step, we need to use the following command to complete the dependency installation:

pip install ./examples/resources/third_party/*
pip install -r requirements.txt

However, sedna-0.4.1 declares a dependency on joblib~=1.2.0, which requires Python>=3.7. The wheel therefore cannot be installed on Python 3.6 unless sedna-0.4.1-py3-none-any.whl is updated.

$ pip install examples/resources/third_party/sedna-0.4.1-py3-none-any.whl 
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Processing ./examples/resources/third_party/sedna-0.4.1-py3-none-any.whl
Collecting six~=1.15.0
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/ee/ff/48bde5c0f013094d729fe4b0316ba2a24774b3ff1c52d924a8a4cb04078a/six-1.15.0-py2.py3-none-any.whl (10 kB)
Collecting PyYAML
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/62/2a/df7727c52e151f9e7b852d7d1580c37bd9e39b2f29568f0f81b29ed0abc2/PyYAML-6.0.1-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (677 kB)
     |████████████████████████████████| 677 kB 3.2 MB/s 
Collecting setuptools~=54.2.0
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/9e/d4/b99a960314121a003e9f39c61dfde01a1010bb47661e193a7722f7f32d52/setuptools-54.2.0-py3-none-any.whl (785 kB)
ERROR: Could not find a version that satisfies the requirement joblib~=1.2.0 (from sedna) (from versions: 0.3.2d.dev, 0.3.2e.dev, 0.3.2f.dev, 0.3.2g.dev, 0.7.0d, 0.1a0.dev0, 0.2a0.dev0, 0.3a0.dev0, 0.3.1a0.dev0, 0.3.2.dev0, 0.3.2a0.dev0, 0.3.2b0.dev0, 0.3.2rc0.dev0, 0.3.3a0.dev0, 0.3.3b0.dev0, 0.3.3rc0.dev0, 0.3.4.dev0, 0.3.5.dev0, 0.3.6.dev0, 0.3.7.dev0, 0.4.0.dev0, 0.4.1.dev0, 0.4.2.dev0, 0.4.3.dev0, 0.4.4.dev0, 0.4.5.dev0, 0.4.6.dev0, 0.5.0.dev0, 0.5.0a0.dev0, 0.5.1.dev0, 0.5.2.dev0, 0.5.3.dev0, 0.5.4.dev0, 0.5.5.dev0, 0.5.6.dev0, 0.5.7.dev0, 0.5.7a0.dev0, 0.5.7b0.dev0, 0.5.7, 0.6.0a0, 0.6.0b0, 0.6.0b2, 0.6.0b3, 0.6.0, 0.6.1, 0.6.2, 0.6.3, 0.6.4, 0.6.5, 0.7.0a0, 0.7.0b0, 0.7.0rc0, 0.7.1, 0.8.0a0, 0.8.0a2, 0.8.0a3, 0.8.0, 0.8.1, 0.8.2, 0.8.3, 0.8.3.post1, 0.8.4, 0.9.0b2, 0.9.0b3, 0.9.0b4, 0.9.1, 0.9.2, 0.9.3, 0.9.4, 0.10.0, 0.10.2, 0.10.3, 0.11a3, 0.11, 0.12.0, 0.12.1, 0.12.2, 0.12.3, 0.12.4, 0.12.5, 0.13.0, 0.13.1, 0.13.2, 0.14.0, 0.14.1, 0.15.0, 0.15.1, 0.16.0, 0.17.0, 1.0.0, 1.0.1, 1.1.0a0, 1.1.0, 1.1.1)
ERROR: No matching distribution found for joblib~=1.2.0
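To see what an offline wheel demands before running pip, its metadata can be read directly, since a wheel is a zip archive with a METADATA file. The following is a minimal sketch using only the standard library. The function name is hypothetical, and the demo uses a tiny in-memory stand-in rather than the real Sedna wheel.

```python
import io
import zipfile
from email.parser import Parser
from typing import List


def wheel_requirements(wheel_file) -> List[str]:
    """Return the Requires-Dist entries declared in a wheel's METADATA file."""
    with zipfile.ZipFile(wheel_file) as archive:
        metadata_name = next(
            name for name in archive.namelist()
            if name.endswith(".dist-info/METADATA")
        )
        metadata_text = archive.read(metadata_name).decode("utf-8")
    # METADATA uses an email-header-like format, so the email parser can read it.
    headers = Parser().parsestr(metadata_text)
    return headers.get_all("Requires-Dist") or []


# Stand-in wheel built in memory for demonstration (not the real Sedna wheel).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr(
        "demo-0.1.dist-info/METADATA",
        "Metadata-Version: 2.1\nName: demo\nVersion: 0.1\n"
        "Requires-Dist: joblib (~=1.2.0)\nRequires-Dist: six (~=1.15.0)\n",
    )
demo.seek(0)
print(wheel_requirements(demo))
```

Passing the path of examples/resources/third_party/sedna-0.4.1-py3-none-any.whl instead of the in-memory file should list the pins that pip later trips over.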

Step 2. Dataset and Model Preparation

Skipped because Step 1 failed.

Python 3.7

Step 1. Ianvs Preparation

Since Python 3.7 is compatible with joblib~=1.2.0, no related issues were encountered. This step was completed successfully.

Step 2. Dataset and Model Preparation

In this step, we need to use the following command to install dependencies for PCB-AoI:

pip install examples/resources/algorithms/FPN_TensorFlow-0.1-py3-none-any.whl

However, a new problem has arisen: the package FPN_TensorFlow-0.1 requires dataclasses~=0.8.

This package is the PyPI backport of dataclasses for Python 3.6 and requires Python>=3.6, <3.7. From Python 3.7 onward, dataclasses is part of the standard library.

Therefore, we are unable to complete the installation of FPN_TensorFlow-0.1-py3-none-any.whl.

ERROR: Ignored the following versions that require a different python version: 0.7 Requires-Python >=3.6, <3.7; 0.8 Requires-Python >=3.6, <3.7; 10.0.0 Requires-Python >=3.8; 10.0.1 Requires-Python >=3.8; 10.1.0 Requires-Python >=3.8; 10.2.0 Requires-Python >=3.8; 10.3.0 Requires-Python >=3.8; 10.4.0 Requires-Python >=3.8; 3.6.0 Requires-Python >=3.8; 3.6.0rc1 Requires-Python >=3.8; 3.6.0rc2 Requires-Python >=3.8; 3.6.1 Requires-Python >=3.8; 3.6.2 Requires-Python >=3.8; 3.6.3 Requires-Python >=3.8; 3.7.0 Requires-Python >=3.8; 3.7.0rc1 Requires-Python >=3.8; 3.7.1 Requires-Python >=3.8; 3.7.2 Requires-Python >=3.8; 3.7.3 Requires-Python >=3.8; 3.7.4 Requires-Python >=3.8; 3.7.5 Requires-Python >=3.8; 3.8.0 Requires-Python >=3.9; 3.8.0rc1 Requires-Python >=3.9; 3.8.1 Requires-Python >=3.9; 3.8.2 Requires-Python >=3.9; 3.8.3 Requires-Python >=3.9; 3.8.4 Requires-Python >=3.9; 3.9.0 Requires-Python >=3.9; 3.9.0rc2 Requires-Python >=3.9; 3.9.1 Requires-Python >=3.9; 3.9.1.post1 Requires-Python >=3.9; 3.9.2 Requires-Python >=3.9
ERROR: Could not find a version that satisfies the requirement dataclasses~=0.8 (from fpn-tensorflow) (from versions: 0.1, 0.2, 0.3, 0.4, 0.5, 0.6)
ERROR: No matching distribution found for dataclasses~=0.8
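One way a package could avoid this failure is to request the backport only on interpreters that lack the standard-library module. The sketch below shows that decision in plain Python. The function name is hypothetical, and nothing here reflects how FPN_TensorFlow is actually packaged.

```python
import sys
from typing import List, Tuple


def dataclasses_requirement(python: Tuple[int, int] = sys.version_info[:2]) -> List[str]:
    """Return the backport requirement only where dataclasses is not in the stdlib."""
    # dataclasses joined the standard library in Python 3.7.
    if python < (3, 7):
        return ["dataclasses~=0.8"]
    return []


print(dataclasses_requirement((3, 6)))
print(dataclasses_requirement((3, 7)))
```

In a requirements file the same condition can be written as an environment marker: `dataclasses~=0.8; python_version < "3.7"`.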

Python 3.8 and higher

Step 1. Ianvs Preparation

During the installation of sedna-0.4.1 using the following command, an issue with uvicorn~=0.14.0 was encountered.

$ pip install examples/resources/third_party/sedna-0.4.1-py3-none-any.whl

WARNING: Ignoring version 0.14.0 of uvicorn since it has invalid metadata:
Requested uvicorn~=0.14.0 from https://pypi.tuna.tsinghua.edu.cn/packages/bf/fe/a41994c92897b162c0c83e8ef10bec54ebdefbce3f3725b530d2091492ac/uvicorn-0.14.0-py3-none-any.whl (from sedna==0.4.1) has invalid metadata: .* suffix can only be used with `==` or `!=` operators
    click (>=7.*)
           ~~~~^
Please use pip<24.1 if you need to use this version

ERROR: Could not find a version that satisfies the requirement uvicorn~=0.14.0 (from sedna) (from versions: 0.0.1, 0.0.2, 0.0.3, 0.0.4, 0.0.5, 0.0.6, 0.0.7, 0.0.8, 0.0.9, 0.0.10, 0.0.11, 0.0.12, 0.0.13, 0.0.14, 0.0.15, 0.1.0, 0.1.1, 0.2.0, 0.2.1, 0.2.2, 0.2.3, 0.2.4, 0.2.5, 0.2.6, 0.2.7, 0.2.8, 0.2.9, 0.2.10, 0.2.11, 0.2.12, 0.2.13, 0.2.14, 0.2.15, 0.2.16, 0.2.17, 0.2.18, 0.2.19, 0.2.20, 0.2.21, 0.2.22, 0.3.0, 0.3.1, 0.3.2, 0.3.3, 0.3.4, 0.3.5, 0.3.6, 0.3.7, 0.3.8, 0.3.9, 0.3.10, 0.3.11, 0.3.12, 0.3.13, 0.3.14, 0.3.15, 0.3.16, 0.3.17, 0.3.18, 0.3.19, 0.3.20, 0.3.21, 0.3.22, 0.3.23, 0.3.24, 0.3.25, 0.3.26, 0.3.27, 0.3.28, 0.3.29, 0.3.30, 0.3.31, 0.3.32, 0.4.0, 0.4.1, 0.4.2, 0.4.3, 0.4.4, 0.4.5, 0.4.6, 0.5.0, 0.5.1, 0.5.2, 0.6.0, 0.6.1, 0.7.0b1, 0.7.0b2, 0.7.0, 0.7.1, 0.7.2, 0.7.3, 0.8.0, 0.8.1, 0.8.2, 0.8.3, 0.8.4, 0.8.5, 0.8.6, 0.9.0, 0.9.1, 0.10.0, 0.10.1, 0.10.2, 0.10.3, 0.10.4, 0.10.5, 0.10.6, 0.10.7, 0.10.8, 0.10.9, 0.11.0, 0.11.1, 0.11.2, 0.11.3, 0.11.4, 0.11.5, 0.11.6, 0.11.7, 0.11.8, 0.12.0, 0.12.1, 0.12.2, 0.12.3, 0.13.0, 0.13.1, 0.13.2, 0.13.3, 0.13.4, 0.14.0, 0.15.0, 0.16.0, 0.17.0.post1, 0.17.1, 0.17.2, 0.17.3, 0.17.4, 0.17.5, 0.17.6, 0.18.0, 0.18.1, 0.18.2, 0.18.3, 0.19.0, 0.20.0, 0.21.0, 0.21.1, 0.22.0, 0.23.0, 0.23.1, 0.23.2, 0.24.0, 0.24.0.post1, 0.25.0, 0.26.0, 0.27.0, 0.27.0.post1, 0.27.1, 0.28.0, 0.28.1, 0.29.0, 0.30.0, 0.30.1, 0.30.2, 0.30.3, 0.30.4, 0.30.5, 0.30.6)
ERROR: No matching distribution found for uvicorn~=0.14.0

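After investigation, the cause is that newer pip releases reject this package's metadata: uvicorn 0.14.0 declares click (>=7.*), and, as the warning above notes, only pip<24.1 still accepts a .* suffix on operators other than == and !=.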
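The rule quoted in the warning can be checked mechanically. Below is a simplified sketch that flags version clauses using a `.*` suffix with any operator other than `==` or `!=`. It is a regex approximation written for illustration, not pip's actual parser.

```python
import re
from typing import List

# One version clause such as ">=7.*" or "==1.2.*"; "===" is listed first so it
# is not split into "==" plus a stray "=".
_CLAUSE = re.compile(r"(===|~=|==|!=|<=|>=|<|>)\s*([^\s,;()]+)")


def invalid_wildcard_clauses(requirement: str) -> List[str]:
    """Return clauses that use a '.*' suffix with an operator other than == or !=."""
    bad = []
    for operator, version in _CLAUSE.findall(requirement):
        if version.endswith(".*") and operator not in ("==", "!="):
            bad.append(operator + version)
    return bad


print(invalid_wildcard_clauses("click (>=7.*)"))
print(invalid_wildcard_clauses("click (==7.*)"))
```

The first call reports the clause that pip complains about; the second is accepted because `==` may carry a wildcard.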

A temporary solution is to clone the sedna repository locally and manually change the version of the uvicorn package in lib/requirements.txt to 0.15.0.

This issue has been reported to the Sedna project; for more details, please check kubeedge/sedna#440.

Step 2. Dataset and Model Preparation

Apart from the issue with dataclasses~=0.8, FPN_TensorFlow-0.1-py3-none-any.whl also requires tensorflow~=1.14.0.

However, according to the tensorflow release history on PyPI, tensorflow==1.14.0 was released on June 19, 2019, and only supports Python<=3.7. On Python>=3.8, tensorflow>=2.0.0 is required.

If you attempt to install it on Python>=3.8, the error message is as follows:

$ pip install examples/resources/algorithms/FPN_TensorFlow-0.1-py3-none-any.whl

Processing ./examples/resources/algorithms/FPN_TensorFlow-0.1-py3-none-any.whl
Collecting wheel~=0.36.2 (from FPN-TensorFlow==0.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/65/63/39d04c74222770ed1589c0eaba06c05891801219272420b40311cd60c880/wheel-0.36.2-py2.py3-none-any.whl (35 kB)
Collecting libs~=0.0.10 (from FPN-TensorFlow==0.1)
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/60/65/01475e4fbf0d7539378019983c6d258e435ffbbfb497ceb6ec7fbea83eed/libs-0.0.10-py3-none-any.whl (5.8 kB)
INFO: pip is looking at multiple versions of fpn-tensorflow to determine which version is compatible with other requirements. This could take a while.
ERROR: Could not find a version that satisfies the requirement tensorflow~=1.14.0 (from fpn-tensorflow) (from versions: 2.2.0, 2.2.1, 2.2.2, 2.2.3, 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.0, 2.4.1, 2.4.2, 2.4.3, 2.4.4, 2.5.0, 2.5.1, 2.5.2, 2.5.3, 2.6.0rc0, 2.6.0rc1, 2.6.0rc2, 2.6.0, 2.6.1, 2.6.2, 2.6.3, 2.6.4, 2.6.5, 2.7.0rc0, 2.7.0rc1, 2.7.0, 2.7.1, 2.7.2, 2.7.3, 2.7.4, 2.8.0rc0, 2.8.0rc1, 2.8.0, 2.8.1, 2.8.2, 2.8.3, 2.8.4, 2.9.0rc0, 2.9.0rc1, 2.9.0rc2, 2.9.0, 2.9.1, 2.9.2, 2.9.3, 2.10.0rc0, 2.10.0rc1, 2.10.0rc2, 2.10.0rc3, 2.10.0, 2.10.1, 2.11.0rc0, 2.11.0rc1, 2.11.0rc2, 2.11.0, 2.11.1, 2.12.0rc0, 2.12.0rc1, 2.12.0, 2.12.1, 2.13.0rc0, 2.13.0rc1, 2.13.0rc2, 2.13.0, 2.13.1)
ERROR: No matching distribution found for tensorflow~=1.14.0

Summary

After testing several versions of Python, none of them can run the existing Quick Start example.

This shows that outdated dependencies have already led to conflicts. Resolving the dependency issues in Ianvs is urgent, and a sound mechanism for updating dependency versions needs to be established.
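The experiment results can be condensed into a small table of Python bounds. The sketch below encodes only the bounds reported in this thread and checks which interpreter versions satisfy all of them. The uvicorn problem is left out because it depends on the pip version rather than the Python version.

```python
from typing import List, Optional, Tuple

Version = Tuple[int, int]

# (requirement, lowest supported Python, highest supported Python), as reported
# in this thread. None means the thread does not state a bound on that side.
CONSTRAINTS: List[Tuple[str, Optional[Version], Optional[Version]]] = [
    ("joblib~=1.2.0 (from sedna-0.4.1)", (3, 7), None),
    ("dataclasses~=0.8 (from FPN_TensorFlow-0.1)", (3, 6), (3, 6)),
    ("tensorflow~=1.14.0 (from FPN_TensorFlow-0.1)", None, (3, 7)),
]


def blockers(python: Version) -> List[str]:
    """Return the requirements that cannot be installed on this Python version."""
    blocked = []
    for requirement, lowest, highest in CONSTRAINTS:
        too_old = lowest is not None and python < lowest
        too_new = highest is not None and python > highest
        if too_old or too_new:
            blocked.append(requirement)
    return blocked


for python in [(3, 6), (3, 7), (3, 8), (3, 9)]:
    print(python, blockers(python) or "no blockers")
```

Every version tested has at least one blocker, which matches the conclusion that no single Python version can run the Quick Start as it stands.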

@FuryMartin
Contributor Author

FuryMartin commented Aug 16, 2024

I conducted more experiments and found that things are a little tricky.

Dependency conflicts between Ianvs Core and examples/

Problems of Sedna

Ianvs Core needs sedna as a dependency. sedna-0.4.1-py3-none-any.whl declares its dependencies as follows:

numpy>=1.13.3  # BSD
colorlog~=4.7.2  # MIT
websockets~=9.1  # BSD
requests>=2.24.0  # Apache-2.0
PyYAML  # MIT
setuptools~=54.2.0
fastapi~=0.68.1  # MIT
pydantic>=1.8.1  # MIT
tenacity~=8.0.1  # Apache-2.0
joblib~=1.2.0  # BSD
pandas  # BSD
six~=1.15.0  # MIT
minio~=7.0.3  # Apache-2.0
uvicorn~=0.14.0  # BSD
pycocotools

The use of ~= locks some dependencies to older versions, while other dependencies in the user's example require newer versions. This creates significant conflicts.
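To make the effect of `~=` concrete: under PEP 440 the compatible-release operator `~=X.Y.Z` is equivalent to `>=X.Y.Z, ==X.Y.*`. The helper below is a minimal sketch of that expansion. Its name is hypothetical, and it only handles plain dotted release numbers without pre-release or post-release suffixes.

```python
def expand_compatible_release(spec: str) -> str:
    """Rewrite '~=X.Y.Z' as the equivalent '>=X.Y.Z, ==X.Y.*' pair (PEP 440)."""
    if not spec.startswith("~="):
        raise ValueError(f"not a compatible-release specifier: {spec!r}")
    version = spec[2:].strip()
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError("~= needs at least two release segments")
    # Drop the last segment and replace it with a wildcard.
    prefix = ".".join(parts[:-1])
    return f">={version}, =={prefix}.*"


print(expand_compatible_release("~=0.68.1"))
print(expand_compatible_release("~=9.1"))
```

So fastapi~=0.68.1 admits only 0.68.x releases, while websockets~=9.1 admits any 9.x release from 9.1 upward.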

For instance, sedna requires fastapi~=0.68.1, and pydantic is a dependency of fastapi. This restriction necessitates satisfying pydantic!=1.7,!=1.7.1,!=1.7.2,!=1.7.3,!=1.8,!=1.8.1,<2.0.0,>=1.6.2.

However, some new examples (such as vllm in #122) require a dependency on pydantic >=2.0, which leads to severe conflicts and causes installation errors with pip.

Installing collected packages: pydantic
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
sedna 0.4.1 requires fastapi~=0.68.1, but you have fastapi 0.112.1 which is incompatible.

This is mainly due to two reasons:

  • Ianvs installs Sedna from the offline wheel sedna-0.4.1-py3-none-any.whl, which forces a step-by-step environment setup and prevents pip from resolving all package constraints in a single pass.
  • Sedna uses specifiers like ~= to describe dependency versions, which pins some non-critical packages (such as uvicorn, fastapi, colorlog, and websockets) to old releases. These pins conflict with the newer packages that users introduce in examples/, making Sedna unusable for them.
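One possible direction for the second point, offered only as an assumption since the thread does not settle on a fix, is to relax `~=` pins to lower bounds for packages where no upper limit is known to matter. The hypothetical helper below rewrites a single requirements.txt line that way.

```python
def loosen_pin(line: str) -> str:
    """Turn 'name~=X.Y.Z  # note' into 'name>=X.Y.Z  # note'; leave other lines as-is."""
    requirement, separator, comment = line.partition("#")
    requirement = requirement.strip()
    if "~=" not in requirement:
        return line
    name, _, version = requirement.partition("~=")
    loosened = f"{name.strip()}>={version.strip()}"
    # Keep the trailing license comment if the original line had one.
    return f"{loosened}  # {comment.strip()}" if separator else loosened


print(loosen_pin("uvicorn~=0.14.0  # BSD"))
print(loosen_pin("PyYAML  # MIT"))
```

Relaxing a pin blindly can expose real API breaks, so each loosened package would still need to be tested against Sedna before such a change is adopted.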

Additionally, while fixing Sedna's requirements.txt, I found that although the latest version of Sedna still carries the version number 0.4.1 in sedna/lib, some of its interfaces have changed significantly.

In the sedna-0.4.1-py3-none-any.whl currently provided by Ianvs, LifelongLearning is defined as follows:

class LifelongLearning(JobBase):
    def __init__(self,
                 estimator,
                 task_definition=None,
                 task_relationship_discovery=None,
                 task_allocation=None,
                 task_remodeling=None,
                 inference_integrate=None,
                 task_update_decision=None,
                 unseen_task_allocation=None,
                 unseen_sample_recognition=None,
                 unseen_sample_re_recognition=None
                 ):

However, in the latest version of Sedna, LifelongLearning is defined as follows:

class LifelongLearning(JobBase):
    def __init__(self,
                 seen_estimator,
                 unseen_estimator=None,
                 task_definition=None,
                 task_relationship_discovery=None,
                 task_allocation=None,
                 task_remodeling=None,
                 inference_integrate=None,
                 task_update_decision=None,
                 unseen_task_allocation=None,
                 unseen_sample_recognition=None,
                 unseen_sample_re_recognition=None
                 ):

Notice that the original interface changed from estimator to seen_estimator, unseen_estimator=None.

As a result, the LifelongLearning algorithm in Ianvs cannot run on the latest version of Sedna, because its call still follows the interface in sedna-0.4.1-py3-none-any.whl:

if paradigm_type == ParadigmType.LIFELONG_LEARNING.value:
    return LifelongLearning(
        estimator=self.module_instances.get(
            ModuleType.BASEMODEL.value),
        task_definition=self.module_instances.get(
            ModuleType.TASK_DEFINITION.value),
        task_relationship_discovery=self.module_instances.get(
            ModuleType.TASK_RELATIONSHIP_DISCOVERY.value),
        task_allocation=self.module_instances.get(
            ModuleType.TASK_ALLOCATION.value),
        task_remodeling=self.module_instances.get(
            ModuleType.TASK_REMODELING.value),
        inference_integrate=self.module_instances.get(
            ModuleType.INFERENCE_INTEGRATE.value),
        task_update_decision=self.module_instances.get(
            ModuleType.TASK_UPDATE_DECISION.value),
        unseen_task_allocation=self.module_instances.get(
            ModuleType.UNSEEN_TASK_ALLOCATION.value),
        unseen_sample_recognition=self.module_instances.get(
            ModuleType.UNSEEN_SAMPLE_RECOGNITION.value),
        unseen_sample_re_recognition=self.module_instances.get(
            ModuleType.UNSEEN_SAMPLE_RE_RECOGNITION.value)
    )
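Until Ianvs and Sedna are aligned, a caller could in principle detect which signature is installed and choose the keyword accordingly. The sketch below illustrates that idea with `inspect.signature`. The helper and the two stand-in classes are hypothetical and are not Sedna code. What should be passed as unseen_estimator is not stated in this thread, so it is left at its default.

```python
import inspect


def estimator_kwargs(paradigm_cls, base_model):
    """Pick the estimator keyword that paradigm_cls.__init__ actually accepts."""
    parameters = inspect.signature(paradigm_cls.__init__).parameters
    if "seen_estimator" in parameters:
        return {"seen_estimator": base_model}
    if "estimator" in parameters:
        return {"estimator": base_model}
    raise TypeError(f"{paradigm_cls.__name__} accepts no known estimator keyword")


# Stand-ins for the two signatures quoted above (not the real Sedna classes).
class OldLifelongLearning:
    def __init__(self, estimator, task_definition=None):
        self.estimator = estimator


class NewLifelongLearning:
    def __init__(self, seen_estimator, unseen_estimator=None, task_definition=None):
        self.seen_estimator = seen_estimator


model = object()
old_job = OldLifelongLearning(**estimator_kwargs(OldLifelongLearning, model))
new_job = NewLifelongLearning(**estimator_kwargs(NewLifelongLearning, model))
print(old_job.estimator is model, new_job.seen_estimator is model)
```

This kind of shim only hides the keyword rename. Any behavioural differences between the two Sedna versions would still need to be reviewed.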

The offline installation method even prevents Ianvs from syncing with the new version of Sedna.

Problems of Examples

The examples also have dependency issues.

If you check PCB-AoI or other examples, you will find that their README.md usually contains an extra dependency configuration guide. These guides can be quite complex, involving the installation of offline wheels or self-modified packages.

What's more, different examples typically have varying dependencies, leading to complex dependency issues both between the example and Ianvs Core / Sedna, and among the examples themselves.

A possible solution

Addressing these issues requires changes to the overall architecture of Ianvs, which may involve:

  • Separating core and examples into different repositories. The dependency differences among different examples are significant, and they target very specific scenarios, making it difficult for general users to benefit from these examples in getting started. I know that some examples may stem from related businesses, but if we want to expand the influence and usability of Ianvs as an open-source project, Ianvs might need to consider only retaining very general and user-friendly examples as a Quick Start, such as basic CV and NLP tasks. The related issue is [ADVICE]Add a simple QuickStart Example #103.
  • Distributing ianvs and Sedna via PyPI, avoiding dependency issues caused by offline installation packages. This is a common practice among mainstream open-source projects and aligns better with user habits.

This plan may be somewhat radical. The final solution must be fully discussed within the community, which is beyond my current ability to resolve. If you're interested in addressing this issue, please feel free to comment.

@FuryMartin
Contributor Author

Considering that I am working on the implementation of #96, I will temporarily put this issue on hold.

@FuryMartin FuryMartin changed the title Complete the dependencies and upgrade the interface to the new version. Complete the dependencies. Aug 16, 2024
@FuryMartin FuryMartin changed the title Complete the dependencies. Complete the dependencies Aug 16, 2024
@FuryMartin FuryMartin changed the title Complete the dependencies Complete unclear dependencies Aug 16, 2024
@FuryMartin FuryMartin changed the title Complete unclear dependencies Complete unclear dependencies for Ianvs Aug 16, 2024
@MooreZheng MooreZheng added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. and removed kind/feature Categorizes issue or PR as related to a new feature. labels Aug 29, 2024
@hsj576
Member

hsj576 commented Aug 30, 2024

It's a good proposal, and I think we can discuss it at the regular community meeting.
