A Framework for Automatic Bundling of Machine Learning Microservices
Business Goals
- The customer wanted to improve MPT's obfuscation approaches, add partial execution support and virtual environment obfuscation, strengthen security, and make the code more maintainable and readable. All of this was needed so that the company's ML and data science engineers could use MPT conveniently.
Challenge
- A full-stack AI biotech provider faced challenges distributing its models and algorithms as a black box, without giving access to the background IP.
Results
- MPT source code refactored
All existing code, written in a functional programming style, was redesigned into more modular, reusable OOP code that is easier to understand and maintain. Code linters and formatters were also added for code quality validation.
- Modular pipeline approach implemented
Making each step of the obfuscation process part of a pipeline enabled partial execution, made steps interchangeable, and simplified the creation of future steps.
- Execution YAML configuration template added
End users can create custom pipeline configuration files in YAML format from a base template. The configuration template was inspired by GitHub workflow templates.
- Cross-platform Docker builds
Users can package the solution regardless of the host platform. This makes it possible to create Docker containers with Python code compiled into binary files that do not depend on the host OS.
- File encryption as part of general pipeline
The manual encryption step was merged into the general pipeline, which made it possible to encrypt and obfuscate code from a single entry point, with basic automated code patching for simple encrypted-file use cases.
- Test cases improved
End-to-end, integration, and unit test cases were developed to cover all main obfuscation flows and security requirements. Code coverage of over 90% was achieved.
Implementation Details
- Code refactoring and formatting - 1.25 weeks
- Originally, all code lived in 2 files with a lot of duplicated functionality. It was refactored in 2 stages: first without, then with business logic modification. The main focus was on modularity and code readability for easier future support.
- Code linters and formatters were added (pylint, bandit, mypy, black) to enforce a single code style and check code quality.
- Modular pipeline approach - 0.5 weeks
- A more mature version of the Strategy pattern, called the Pipeline-Action approach, was used. All actions that can be performed on the target solution are represented as steps of the pipeline.
- A separate pipeline configuration service creates steps based on the user-provided configuration.
- A common state is shared between steps to transfer data that changes from step to step.
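The pipeline idea described above can be illustrated with a minimal sketch. It is not MPT's actual code: the class names (`PipelineState`, `Step`, `Pipeline`), the `only` parameter, and the three placeholder actions are all assumptions made for illustration. It shows steps running in order over a shared state, and partial execution by selecting a subset of steps.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class PipelineState:
    """Shared state passed from step to step (hypothetical structure)."""
    data: Dict[str, Any] = field(default_factory=dict)
    executed: List[str] = field(default_factory=list)


@dataclass
class Step:
    """A named action that reads and updates the shared state."""
    name: str
    action: Callable[[PipelineState], None]

    def run(self, state: PipelineState) -> None:
        self.action(state)
        state.executed.append(self.name)


class Pipeline:
    def __init__(self, steps: List[Step]) -> None:
        self.steps = steps

    def run(self, state: Optional[PipelineState] = None,
            only: Optional[List[str]] = None) -> PipelineState:
        """Run all steps in order, or only the named ones (partial execution)."""
        state = state if state is not None else PipelineState()
        for step in self.steps:
            if only is None or step.name in only:
                step.run(state)
        return state


# Placeholder actions; real steps would call the obfuscation or build tools.
def obfuscate(state: PipelineState) -> None:
    state.data["obfuscated"] = True


def encrypt(state: PipelineState) -> None:
    state.data["encrypted"] = True


def build_image(state: PipelineState) -> None:
    state.data["image"] = "built"


pipeline = Pipeline([
    Step("obfuscate", obfuscate),
    Step("encrypt", encrypt),
    Step("build_image", build_image),
])

full_run = pipeline.run()
partial_run = pipeline.run(only=["obfuscate", "build_image"])
print(full_run.executed)
print(partial_run.executed)
```

Because every step has the same interface, steps can be swapped or skipped without touching the others, which is the property that partial execution relies on.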
- Execution YAML configuration template added - 0.25 week
- A custom configuration template with detailed format validation was created.
- The YAML file structure was inspired by GitHub Actions, so every step has its own options and can be executed separately.
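Format validation of such a configuration could look roughly like the sketch below. The real template's schema is not given in this description, so the step names, the `with` key for per-step options (borrowed from GitHub Actions), and the allowed option names are all hypothetical. The function works on the dictionary a YAML parser would return, so the example needs only the standard library.

```python
from typing import Any, Dict, List, Set

# Hypothetical step names and the options each one accepts.
ALLOWED_STEPS: Dict[str, Set[str]] = {
    "obfuscate": {"target", "exclude"},
    "encrypt": {"files"},
    "build": {"platform", "tag"},
}


def validate_config(config: Any) -> List[str]:
    """Return a list of human-readable errors; an empty list means valid."""
    if not isinstance(config, dict):
        return ["configuration must be a mapping"]
    steps = config.get("steps")
    if not isinstance(steps, list) or not steps:
        return ["'steps' must be a non-empty list"]

    errors: List[str] = []
    for index, step in enumerate(steps):
        if not isinstance(step, dict):
            errors.append(f"step {index}: must be a mapping")
            continue
        name = step.get("name")
        if not isinstance(name, str) or name not in ALLOWED_STEPS:
            errors.append(f"step {index}: unknown step name {name!r}")
            continue
        options = step.get("with", {})
        if not isinstance(options, dict):
            errors.append(f"step {index} ({name}): 'with' must be a mapping")
            continue
        for option in sorted(set(options) - ALLOWED_STEPS[name]):
            errors.append(f"step {index} ({name}): unknown option {option!r}")
    return errors


# Dictionaries shaped like the output of a YAML parser.
valid_config = {
    "steps": [
        {"name": "obfuscate", "with": {"target": "src/"}},
        {"name": "build", "with": {"platform": "linux/amd64", "tag": "demo"}},
    ]
}
invalid_config = {
    "steps": [
        {"name": "obfuscate", "with": {"level": 3}},
        {"name": "sign"},
    ]
}

print(validate_config(valid_config))
print(validate_config(invalid_config))
```

Collecting all errors instead of stopping at the first one lets a user fix a configuration file in a single pass.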
- Cross-platform builds and DinD mode - 1 week
- The main issue with the previous version of MPT was that it was locked to Linux. Because the code is compiled, the OS of the dockerized solution and the host OS had to be the same for the binary files to execute successfully. To remove this limitation, a Docker-in-Docker (DinD) mode was added, which makes it possible to obfuscate the target code as a stage of Docker container creation.
- Docker-in-Docker mode also made it possible to implement the environment obfuscation feature.
- File encryption as part of general pipeline - 0.25 week
- Originally, file encryption was a separate step that required manual code patching. Basic patching was automated using regular expressions and an additional decryption module.
- The encryption logic was turned into a pipeline step so that it can be used optionally.
- Backward compatibility with the manual steps was preserved through partial execution, so advanced cases that require human interaction are still supported.
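As a rough illustration of regex-based patching, the sketch below rewrites simple `open("<path>", "rb")` calls for files that were encrypted so that they go through a decryption helper. The details of MPT's patching are not described here, so everything specific is an assumption: the helper name `decrypt_open`, the module name `decryption_helper`, the `.enc` suffix, and the choice to handle only literal paths opened in `"rb"` mode.

```python
import re
from typing import Set

# Matches open("<path>", "rb") or open('<path>', 'rb') with a literal path.
OPEN_CALL = re.compile(
    r"""\bopen\(\s*(?P<quote>['"])(?P<path>[^'"]+)(?P=quote)\s*,\s*['"]rb['"]\s*\)"""
)

# Hypothetical import line for the extra decryption module.
HELPER_IMPORT = "from decryption_helper import decrypt_open\n"


def patch_source(source: str, encrypted_files: Set[str]) -> str:
    """Redirect reads of encrypted files to a (hypothetical) decrypt_open()."""

    def replace(match: "re.Match[str]") -> str:
        path = match.group("path")
        if path not in encrypted_files:
            return match.group(0)  # leave unrelated open() calls untouched
        return f'decrypt_open("{path}.enc")'

    patched = OPEN_CALL.sub(replace, source)
    if patched != source:
        patched = HELPER_IMPORT + patched
    return patched


original = (
    'data = open("weights.bin", "rb").read()\n'
    'log = open("notes.txt", "rb").read()\n'
)
patched = patch_source(original, {"weights.bin"})
print(patched)
```

A regex like this covers only the simplest call shapes. Paths built at runtime or files opened through other APIs would still need manual patching, which is consistent with keeping the manual steps available through partial execution.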
- Automation test case development - 0.75 week
- Unit tests were developed to test each step separately, integration tests were added for the Docker, PyArmor, and Nuitka integrations, and end-to-end tests were created to test general tool usage. The end-to-end tests covered different obfuscation scenarios, with or without encryption, and included searching for background IP in the final binary files.
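The check for background IP in the final binaries can be sketched as a search for known marker strings in the raw bytes of a build artifact. This is an assumption about how such a test might work, not the project's test code: the function name `find_leaked_markers`, the marker strings, and the fake artifact below are all made up for illustration.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def find_leaked_markers(binary_path: Path, markers: Iterable[str]) -> List[str]:
    """Return the markers that appear verbatim in the file's raw bytes."""
    content = binary_path.read_bytes()
    return [marker for marker in markers if marker.encode("utf-8") in content]


# Build a fake artifact that still contains one identifiable function name.
with tempfile.TemporaryDirectory() as tmp_dir:
    artifact = Path(tmp_dir) / "service.bin"
    artifact.write_bytes(b"\x7fELF\x00\x01 compute_binding_score \x00\xff")

    markers = ["compute_binding_score", "ProprietaryScoringModel"]
    leaked = find_leaked_markers(artifact, markers)

print(leaked)
```

In an end-to-end test, an assertion that the returned list is empty would fail the build whenever a recognizable identifier or string from the original source survives obfuscation.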
Industry
Service
Type
- Template Solution
Keywords
- Security Engineering
- Reverse Engineering
- Code obfuscation
- Containerization
- CLI and Python package Development
Roadmap
Code refactoring and formatting
Modular pipeline approach
Execution YAML configuration template added
Cross-platform builds and DinD mode
File encryption as part of general pipeline
Automation test case development
Want to talk?
Michael Gurbych
Director,
Operations and Finance