Releases: octue/octue-sdk-python
Add documentation on updating Octue services
Contents (#683)
Operations
- Use latest `ruff` pre-commit check
Dependencies
- Add `ruff` to dev dependencies
Other
- Add doc on updating an Octue service
Switch to ruff developer tooling
Contents (#682)
Operations
- Switch from `flake8`, `black`, and `isort` to `ruff`
Dependencies
- Remove old formatters/linters and add `ruff` config
Refactoring
- Apply `ruff` to all files
Check for service revision existence
Contents (#680)
IMPORTANT: There is 1 breaking change.
Enhancements
- 💥 BREAKING CHANGE: Use cloud URIs by default for datasets in output manifests
- Add comments around checking for service revision existence
- Improve error when `octue.services` topic doesn't exist
Fixes
- Raise error if service revision subscription doesn't exist when no service registry is in use
- Remove `octue.services` prefix from subscription names
Refactoring
- Avoid repeated conversion to Pub/Sub ID for a service
Upgrade instructions
💥 Use cloud URIs by default for datasets in output manifests
Set `use_signed_urls_for_output_datasets` to `True` in the app configuration to keep using signed URLs for datasets in output manifests.
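A minimal sketch of the setting above, assuming the app configuration is stored as JSON. Only the `use_signed_urls_for_output_datasets` key comes from these notes; the `output_location` value is a made-up example.

```python
import json

# Hypothetical app configuration contents. Only the
# `use_signed_urls_for_output_datasets` key is taken from the release notes;
# the bucket path is an assumed example value.
app_configuration = {
    "output_location": "gs://my-bucket/outputs",
    "use_signed_urls_for_output_datasets": True,  # keep using signed URLs
}

# Serialise to the JSON text that would go in the app configuration file.
app_configuration_json = json.dumps(app_configuration, indent=4)
print(app_configuration_json)
```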
Revert analysis output location removal
Contents (#677)
Fixes
- Pass output arguments into `Analysis` and use them
Reversions
- Revert "REF: Stop storing `output_location` in `Analysis`"
Make signed URLs for output datasets optional
Contents (#676)
IMPORTANT: There is 1 breaking change.
Enhancements
- Allow using non-signed URLs for datasets in output manifest (controllable via the app configuration file)
- Handle all `requests` errors while:
  - Getting cloud metadata for datafiles and datasets
  - Downloading datafiles
Fixes
- Avoid trying to access buckets for URL datasets
Refactoring
- 💥 BREAKING CHANGE: Stop storing `output_location` in `Analysis`
- Remove unnecessary finalisation from template apps
Upgrade instructions
💥 Stop storing `output_location` in `Analysis`
If calling `Analysis.finalise` manually, either stop doing this and rely on the `output_location` field of the app configuration or explicitly pass in the `upload_output_datasets_to` argument.
Improve event filtering
Contents (#673)
IMPORTANT: There are 2 breaking changes.
New features
- Add `dictionary_product` utility function
Enhancements
- 💥 BREAKING CHANGE: Disable event validation in `EventReplayer` by default
- 💥 BREAKING CHANGE: Enable filtering by multiple event kinds in `get_events`
- Enable excluding multiple event kinds in `get_events`
- Use all non-question events for question redelivery check in Flask app
- Add ability to skip handling logs containing certain text in `AbstractEventHandler` and subclasses
- Return outside of `ThreadPoolExecutor` context managers
Upgrade instructions
💥 Disable event validation in `EventReplayer` by default
Set `validate_events=True` in the `EventReplayer` constructor to retain the previous behaviour.
💥 Enable filtering by multiple event kinds in `get_events`
To filter by one event kind as before, use `kinds=[event_kind]` instead of `kind=event_kind`.
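To illustrate the move from a single `kind` to a list of `kinds`, here is a small stand-in filter. It is not the SDK's `get_events`: the function name `filter_events`, the `exclude_kinds` parameter name, the event layout, and the kind names are all assumptions made for the sketch.

```python
def filter_events(events, kinds=None, exclude_kinds=None):
    """Filter events by kind. A stand-in for illustration, NOT the SDK's `get_events`."""
    selected = []
    for event in events:
        kind = event["event"]["kind"]
        # Keep only the requested kinds, if any were requested.
        if kinds is not None and kind not in kinds:
            continue
        # Drop any explicitly excluded kinds.
        if exclude_kinds is not None and kind in exclude_kinds:
            continue
        selected.append(event)
    return selected


# Example events (layout and kind names assumed for illustration).
events = [
    {"event": {"kind": "question"}, "attributes": {}},
    {"event": {"kind": "log_record"}, "attributes": {}},
    {"event": {"kind": "result"}, "attributes": {}},
]

one_kind = filter_events(events, kinds=["result"])  # single kind, now given as a list
several_kinds = filter_events(events, kinds=["question", "result"])
without_logs = filter_events(events, exclude_kinds=["log_record"])
```

Passing a one-item list is the direct replacement for the old single-kind call.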
Enable question retries on single questions
Summary
This release adds to the question retry capability already available on concurrent questions by allowing retries of single questions.
Contents (#671)
Enhancements
- Enable question retries on single questions with `Child.ask`
- Log when retries are prevented for an exception type in `Child.ask`
- Remove `PYTHONUNBUFFERED` warning
Refactoring
- Move retry logic from `Child.ask_multiple` into `Child.ask`
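The retry behaviour described in this release can be pictured with a generic wrapper. This is a sketch only and not the SDK's implementation: the helper `ask_with_retries` and its parameters `max_retries` and `prevent_retries_when` are hypothetical names chosen for illustration.

```python
import logging

logger = logging.getLogger(__name__)


def ask_with_retries(ask, max_retries=2, prevent_retries_when=()):
    """Call `ask`, retrying on failure. Hypothetical helper, not the SDK's code."""
    attempt = 0
    while True:
        try:
            return ask()
        except Exception as error:
            # Log and re-raise immediately if retries are prevented for this type.
            if isinstance(error, tuple(prevent_retries_when)):
                logger.info("Retries are prevented for %s.", type(error).__name__)
                raise
            # Give up once the retry budget is used.
            if attempt >= max_retries:
                raise
            attempt += 1
            logger.info("Retrying question (retry %d of %d).", attempt, max_retries)


calls = []


def flaky_question():
    """Fail twice, then succeed, to exercise the retry loop."""
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("temporary failure")
    return {"output_values": 42}


answer = ask_with_retries(flaky_question, max_retries=3)
```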
Speed up event replaying
Contents (#669)
Enhancements
- Skip non-result event validation if only result is required
- Add ability to skip event validation in event handlers
- Make diagnostics log messages more consistent
- Allow instantiation of `Diagnostics`, `Topic`, `Subscription`, and `GoogleCloudPubSubEventHandler` without cloud credentials
Refactoring
- Update from deprecated `datetime.datetime.utcnow` method
- Use `cached_property` in `Service`
- Remove unused attributes on `MockService` and `Runner`
Testing
- Implement `MockSubscription.delete`
Update child emulator and improve manifest dataset download
Contents (#668)
IMPORTANT: There are 2 breaking changes.
Enhancements
- 💥 BREAKING CHANGE: Update `ChildEmulator` to use `EventReplayer`, support schema-compliant events and attributes, and support heartbeats and delivery acknowledgement events. This significantly simplifies the emulator
- 💥 BREAKING CHANGE: Remove `ChildEmulator.from_file`
- Download manifest datasets to same directory by default
Refactoring
- Move `ServicePatcher` into its own module
Upgrade instructions
💥 Update `ChildEmulator` to use `EventReplayer` and full events
Give events (including attributes) that satisfy the service communication schema to child emulators.
💥 Remove `ChildEmulator.from_file`
Load the JSON file separately and pass the events into the `ChildEmulator` constructor.
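A sketch of loading the events yourself before constructing the emulator. To stay self-contained it uses a stand-in class instead of importing the SDK; the class name `EmulatorStandIn`, the `events` keyword, the file layout, and the example event contents are assumptions for illustration.

```python
import json
import os
import tempfile


class EmulatorStandIn:
    """Hypothetical stand-in for a child emulator that is given events directly."""

    def __init__(self, events):
        self.events = events


# Write an example events file. Each item pairs an event with its attributes;
# the specific fields shown are assumed example values.
events = [
    {
        "event": {"kind": "result", "output_values": [1, 2, 3]},
        "attributes": {"question_uuid": "example-uuid"},
    },
]
path = os.path.join(tempfile.mkdtemp(), "events.json")
with open(path, "w") as f:
    json.dump(events, f)

# Load the JSON file separately...
with open(path) as f:
    loaded_events = json.load(f)

# ...then pass the events into the constructor.
emulator = EmulatorStandIn(events=loaded_events)
```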
Enable question chaining
Summary
This release makes major improvements to event handling and question auditing. Some of the main changes are:
- Questions are now automatically associated with their parent question and the question that originated them, however deep they are in a question tree
- Events are ordered by datetime by the event backend, not the SDK
- Better feedback is provided when asking questions in parallel
- You can specify the event store to use
- Log message contexts have been slimmed down without losing any information, and events are replayable with no context (good for smaller screens)
- Various public classes and functions are faster and easier to use
- Question retries have the same question UUID
Contents (#660)
IMPORTANT: There are 6 breaking changes.
New features
Events
- 💥 BREAKING CHANGE: Add `parent_question_uuid`, `originator_question_uuid`, `originator` and `retry_count` event attributes
- Avoid redelivery of questions by checking the event store on delivery
Event handlers
- Add ability to not include service metadata in logs in event handlers
- Enable `EventReplayer` to handle question events
- Add `RegisteredTemporaryDirectory` class, use it when downloading datasets, and add ability to delete them at end of analysis
Enhancements
Resources
- 💥 BREAKING CHANGE: Make datasets recursive by default in `Dataset`
- Log a warning if a dataset is empty at instantiation
Services
- 💥 BREAKING CHANGE: Remove `name` argument from `Service` and provide an SRUID to `Child` internal service instead of a name
- Improve logging of errors, retries, and threading in `Child.ask_multiple`
- Order pub/sub messages by datetime using ordering key and remove `order` event attribute
- Set question UUIDs in advance in `Child.ask_multiple`
Subscriptions
- Allow existing subscriptions in `create_push_subscription`
- Give feedback on (un)successful push subscription creation in CLI
Questions and events
- Remove unnecessary `sender` argument from `get_events` and make getting the tail of events the default
- Allow retried questions to have the same UUID
- Allow explicit question retries by using `retry_count` attribute
- Return empty list from `get_events` if no events for question
Service configuration
- Allow setting of event store table ID and `delete_local_files` in service configuration
- Use envvar to specify service configuration location by default
- Add `overrides` option to `Runner.from_configuration`
Other
- Log warning when `PYTHONUNBUFFERED` envvar is unset
- Remove "analysis-" from start of question UUIDs in log context
Fixes
- 💥 BREAKING CHANGE: Return question UUID alongside error from `Child.ask_multiple` for failed questions
- Set analysis ID at start of `Runner.run`
- Emit correct logs when no diagnostics available with `octue get-diagnostics`
- Fix deserialisation of events in `get_events`
- Use (meta-)generation agnostic retry strategy with cloud storage
- Return correct question UUIDs with failed questions from `Child.ask_multiple`
- Avoid logging that app failed when it didn't when uploading diagnostics
- Allow setting of `max_workers` when CPU count is indeterminate
- Disable `delete_local_files` by default
Operations
- Update event handler and its bigquery table
Dependencies
- Loosen `Sphinx` and other docs package ranges
- Remove unneeded `db-dtypes` package
- Make `google-cloud-bigquery` a mandatory dependency
- Upgrade `google-cloud-secret-manager`
Refactoring
Event handlers
- 💥 BREAKING CHANGE: Remove redundant datetime from delivery ack and heartbeat events
- 💥 BREAKING CHANGE: Rename `originator` event attribute to `parent`
- Factor out finalising and cleaning up in
Runner
- Move service accounts into separate terraform file
- Cache metadata against datafile/dataset instead of path
- Rename python3.9 dockerfile to reflect its python version
Upgrade instructions
- Add `recursive=False` to `Dataset` instantiations
- Update all services in your service network to use `octue>=0.56.0`
- Use version `0.6.1` of the event handler or above and a correspondingly up-to-date BigQuery table.
- Swap the `internal_service_name` argument for the `internal_sruid` argument in `Child.__init__` and provide a valid SRUID
- Instances of `Service` can no longer be given names. Please give them a valid SRUID instead.
- To get the unraised exception from a failed answer returned by `Child.ask_multiple`, access the zeroth element of that answer, e.g. if the third question failed: `answers = Child.ask_multiple(*questions)` followed by `exception, question_uuid = answers[2]`
- `Service.received_events`, `AbstractEventHandler.handled_events`, and `Child.received_events` now include event attributes instead of just the event. These attributes/properties now return a list of dictionaries with the keys {"event", "attributes"}, where what was previously returned is now mapped to the "event" key.
- Stop providing the `recipient` argument to `EventReplayer` and `GoogleCloudPubSubEventHandler` - it's now automatically acquired from each event's attributes
- Stop passing the `skip_missing_events_after` argument to `EventReplayer` and `GoogleCloudPubSubEventHandler`
- Stop using the `awaiting_missing_event` and `time_since_missing_event` properties on the event handlers
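Two of the upgrade steps above change the shape of data that calling code receives. The sketch below uses plain Python stand-ins rather than the SDK itself. The example events, UUIDs, and the shape of the successful answers are assumptions; only the {"event", "attributes"} keys and the (exception, question UUID) pairing for failed answers come from these notes.

```python
# Items in `received_events` / `handled_events` are now dictionaries holding
# the event together with its attributes (example contents are assumed).
received_events = [
    {"event": {"kind": "log_record"}, "attributes": {"question_uuid": "q-1"}},
    {"event": {"kind": "result"}, "attributes": {"question_uuid": "q-1"}},
]

# Code that previously consumed the bare events can read the "event" key.
bare_events = [item["event"] for item in received_events]

# Stand-in for the answers to three questions where the third one failed.
# The successful answers are placeholders; their real shape is not shown here.
answers = [
    {"output_values": 1},
    {"output_values": 2},
    (ValueError("example failure"), "q-3"),
]

# A failed answer unpacks into the unraised exception and the question UUID.
exception, question_uuid = answers[2]
```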