Apify API client for Python
Full documentation of the apify-client-python package version master, which simplifies access to the Apify API using Python.
apify_client is the official library to access the Apify API from your Python applications.
It provides useful features like automatic retries and convenience functions that improve the experience of using the Apify API.
Quick Start
Features
Automatic parsing and error handling
Retries with exponential backoff
Convenience functions and options
Usage concepts
Nested clients
Pagination
API Reference
Installation
Requires Python 3.7+
You can install the client from its PyPI listing.
To do that, simply run pip install apify-client.
Features
Besides greatly simplifying the process of querying the Apify API, the client provides other useful features.
Automatic parsing and error handling
Based on the endpoint, the client automatically extracts the relevant data and returns it in the
expected format. Date strings are automatically converted to datetime objects. For errors,
the client raises an ApifyApiError, which wraps the plain JSON errors returned by the API and enriches
them with other context for easier debugging.
Retries with exponential backoff
Network communication sometimes fails. The client will automatically retry requests that
failed due to a network error, an internal error of the Apify API (HTTP 500+) or a rate limit error (HTTP 429).
By default, it will retry up to 8 times. The first retry will be attempted after ~500 ms, the second after ~1000 ms,
and so on. You can configure those parameters using the max_retries and min_delay_between_retries_millis
options of the ApifyClient constructor.
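To get a feel for how the two options interact, the following minimal sketch computes an approximate retry schedule. It assumes each delay simply doubles, starting at min_delay_between_retries_millis. The helper name retry_delays_millis is hypothetical and not part of the client, and any randomization the real client may apply is ignored.

```python
def retry_delays_millis(max_retries=8, min_delay_between_retries_millis=500):
    """Approximate wait before each retry, assuming the delay doubles every attempt."""
    return [
        min_delay_between_retries_millis * 2 ** attempt
        for attempt in range(max_retries)
    ]


# Schedule for the documented defaults (8 retries, 500 ms minimum delay).
default_delays = retry_delays_millis()
print(default_delays)
print("worst-case total wait in seconds:", sum(default_delays) / 1000)
```

Under this assumption, raising max_retries lengthens the worst-case wait much faster than raising the minimum delay does.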
Convenience functions and options
Some actions can’t be performed by the API itself, such as indefinite waiting for an actor run to finish
(because of network timeouts). The client provides convenient call() and wait_for_finish() functions that do that.
Key-value store records can be retrieved as objects, buffers or streams via the respective options. Dataset items
can be fetched as individual objects or serialized data. We plan to add better stream support and async iterators.
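The idea behind waiting on the client side can be sketched as a polling loop. This is only an illustration of the technique, not the client's actual implementation: wait_until_finished, fetch_run and the fake run below are hypothetical, and the terminal statuses are the ones named in the wait_for_finish() reference.

```python
import time

# Terminal statuses as listed in the wait_for_finish() reference.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "TIMED_OUT", "ABORTED"}


def wait_until_finished(fetch_run, wait_secs=None, poll_interval_secs=1.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_run() until the run is terminal or wait_secs has elapsed.

    fetch_run is any callable returning a dict with a "status" key.
    With wait_secs=None the loop waits indefinitely.
    """
    deadline = None if wait_secs is None else clock() + wait_secs
    while True:
        run = fetch_run()
        if run["status"] in TERMINAL_STATUSES:
            return run
        if deadline is not None and clock() >= deadline:
            return run  # not finished yet; the caller must check the status
        sleep(poll_interval_secs)


# Demonstration with a fake run that finishes on the third poll.
_statuses = iter(["READY", "RUNNING", "SUCCEEDED"])


def fake_fetch_run():
    return {"id": "fake-run", "status": next(_statuses)}


final_run = wait_until_finished(fake_fetch_run, poll_interval_secs=0,
                                sleep=lambda secs: None)
print(final_run["status"])
```

Because each poll is a short request, no single HTTP connection has to stay open for the whole run, which is the limitation the paragraph above refers to.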
Usage concepts
The ApifyClient interface follows a generic pattern that is applicable to all of its components.
By calling individual methods of ApifyClient, specific clients which target individual API
resources are created. There are two types of those clients. A client for management of a single
resource and a client for a collection of resources.
The ID of the resource can be either the id of the said resource,
or a combination of your username/resource-name.
This is really all you need to remember, because all resource clients
follow the pattern you see above.
Nested clients
Sometimes clients return other clients. That’s to simplify working with
nested collections, such as runs of a given actor.
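As a rough sketch of the pattern, the helper below gathers a collection client, a single-resource client and a nested client for actors. It relies only on the actors(), actor() and runs() methods listed in the API reference. The helper names, the token and the actor ID are placeholders, and the import is done lazily so the snippet loads even where the package is not installed.

```python
def get_actor_clients(client, actor_id):
    """Collect the kinds of clients described above for the actor resource."""
    return {
        # Client for the whole collection of actors.
        "collection": client.actors(),
        # Client for one actor, addressed by ID or by "username/actor-name".
        "single": client.actor(actor_id),
        # Nested client: the runs of that one actor.
        "nested_runs": client.actor(actor_id).runs(),
    }


def get_actor_clients_from_token(token, actor_id):
    """Same as above, but builds a real ApifyClient from an API token."""
    # Imported lazily so this sketch can be loaded without the package installed.
    from apify_client import ApifyClient

    return get_actor_clients(ApifyClient(token), actor_id)
```

Calling get_actor_clients_from_token('MY-APIFY-TOKEN', 'username/actor-name') would return the three clients without performing any API request yet, since requests are only sent by methods such as get() or list().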
Pagination
Most methods named list or list_something return a ListPage object,
containing properties items, total, offset, count and limit.
There are some exceptions though, like list_keys or list_head which paginate differently.
The results you’re looking for are always stored under items and you can use the limit
property to get only a subset of results. Other properties can be available depending on the method.
API Reference
All public classes, methods and their parameters can be inspected in this API reference.
ApifyClient
The Apify API client.
__init__()
actor()
actors()
build()
builds()
run()
runs()
dataset()
datasets()
key_value_store()
key_value_stores()
request_queue()
request_queues()
webhook()
webhooks()
webhook_dispatch()
webhook_dispatches()
schedule()
schedules()
log()
task()
tasks()
user()
ApifyClient.__init__(token=None, *, api_url=None, max_retries=8, min_delay_between_retries_millis=500)
Initialize the Apify API Client.
Parameters
token (str, optional) – The Apify API token
api_url (str, optional) – The URL of the Apify API server to which to connect to. Defaults to
max_retries (int, optional) – How many times to retry a failed request at most
min_delay_between_retries_millis (int, optional) – How long will the client wait between retrying requests
(increases exponentially from this value)
ApifyClient.actor(actor_id)
Retrieve the sub-client for manipulating a single actor.
actor_id (str) – ID of the actor to be manipulated
Return type
ActorClient
ApifyClient.actors()
Retrieve the sub-client for manipulating actors.
ActorCollectionClient
ApifyClient.build(build_id)
Retrieve the sub-client for manipulating a single actor build.
build_id (str) – ID of the actor build to be manipulated
BuildClient
ApifyClient.builds()
Retrieve the sub-client for querying multiple builds of a user.
BuildCollectionClient
ApifyClient.run(run_id)
Retrieve the sub-client for manipulating a single actor run.
run_id (str) – ID of the actor run to be manipulated
RunClient
ApifyClient.runs()
Retrieve the sub-client for querying multiple actor runs of a user.
RunCollectionClient
ApifyClient.dataset(dataset_id)
Retrieve the sub-client for manipulating a single dataset.
dataset_id (str) – ID of the dataset to be manipulated
DatasetClient
ApifyClient.datasets()
Retrieve the sub-client for manipulating datasets.
DatasetCollectionClient
ApifyClient.key_value_store(key_value_store_id)
Retrieve the sub-client for manipulating a single key-value store.
key_value_store_id (str) – ID of the key-value store to be manipulated
KeyValueStoreClient
ApifyClient.key_value_stores()
Retrieve the sub-client for manipulating key-value stores.
KeyValueStoreCollectionClient
ApifyClient.request_queue(request_queue_id, *, client_key=None)
Retrieve the sub-client for manipulating a single request queue.
request_queue_id (str) – ID of the request queue to be manipulated
client_key (str) – A unique identifier of the client accessing the request queue
RequestQueueClient
ApifyClient.request_queues()
Retrieve the sub-client for manipulating request queues.
RequestQueueCollectionClient
ApifyClient.webhook(webhook_id)
Retrieve the sub-client for manipulating a single webhook.
webhook_id (str) – ID of the webhook to be manipulated
WebhookClient
ApifyClient.webhooks()
Retrieve the sub-client for querying multiple webhooks of a user.
WebhookCollectionClient
ApifyClient.webhook_dispatch(webhook_dispatch_id)
Retrieve the sub-client for accessing a single webhook dispatch.
webhook_dispatch_id (str) – ID of the webhook dispatch to access
WebhookDispatchClient
ApifyClient.webhook_dispatches()
Retrieve the sub-client for querying multiple webhook dispatches of a user.
WebhookDispatchCollectionClient
ApifyClient.schedule(schedule_id)
Retrieve the sub-client for manipulating a single schedule.
schedule_id (str) – ID of the schedule to be manipulated
ScheduleClient
ApifyClient.schedules()
Retrieve the sub-client for manipulating schedules.
ScheduleCollectionClient
ApifyClient.log(build_or_run_id)
Retrieve the sub-client for retrieving logs.
build_or_run_id (str) – ID of the actor build or run for which to access the log
LogClient
ApifyClient.task(task_id)
Retrieve the sub-client for manipulating a single task.
task_id (str) – ID of the task to be manipulated
TaskClient
ApifyClient.tasks()
Retrieve the sub-client for manipulating tasks.
TaskCollectionClient
ApifyClient.user(user_id=None)
Retrieve the sub-client for querying users.
user_id (str, optional) – ID of user to be queried. If None, queries the user belonging to the token supplied to the client
UserClient
ActorClient
Sub-client for manipulating a single actor.
get()
update()
delete()
start()
call()
last_run()
versions()
version()
ActorClient.get()
Retrieve the actor.
Returns
The retrieved actor
dict, optional
ActorClient.update(*, name=None, title=None, description=None, seo_title=None, seo_description=None, versions=None, restart_on_error=None, is_public=None, is_deprecated=None, is_anonymously_runnable=None, categories=None, default_run_build=None, default_run_memory_mbytes=None, default_run_timeout_secs=None, example_run_input_body=None, example_run_input_content_type=None)
Update the actor with the specified fields.
name (str, optional) – The name of the actor
title (str, optional) – The title of the actor (human-readable)
description (str, optional) – The description for the actor
seo_title (str, optional) – The title of the actor optimized for search engines
seo_description (str, optional) – The description of the actor optimized for search engines
versions (list of dict, optional) – The list of actor versions
restart_on_error (bool, optional) – If true, the main actor run process will be restarted whenever it exits with a non-zero status code.
is_public (bool, optional) – Whether the actor is public.
is_deprecated (bool, optional) – Whether the actor is deprecated.
is_anonymously_runnable (bool, optional) – Whether the actor is anonymously runnable.
categories (list of str, optional) – The categories to which the actor belongs.
default_run_build (str, optional) – Tag or number of the build that you want to run by default.
default_run_memory_mbytes (int, optional) – Default amount of memory allocated for the runs of this actor, in megabytes.
default_run_timeout_secs (int, optional) – Default timeout for the runs of this actor in seconds.
example_run_input_body (Any, optional) – Input to be prefilled as default input to new users of this actor.
example_run_input_content_type (str, optional) – The content type of the example run input.
The updated actor
dict
ActorClient.delete()
Delete the actor.
None
ActorClient.start(*, run_input=None, content_type=None, build=None, memory_mbytes=None, timeout_secs=None, wait_for_finish=None, webhooks=None)
Start the actor and immediately return the Run object.
run_input (Any, optional) – The input to pass to the actor run.
content_type (str, optional) – The content type of the input.
build (str, optional) – Specifies the actor build to run. It can be either a build tag or build number.
By default, the run uses the build specified in the default run configuration for the actor (typically latest).
memory_mbytes (int, optional) – Memory limit for the run, in megabytes.
By default, the run uses a memory limit specified in the default run configuration for the actor.
timeout_secs (int, optional) – Optional timeout for the run, in seconds.
By default, the run uses timeout specified in the default run configuration for the actor.
wait_for_finish (int, optional) – The maximum number of seconds the server waits for the run to finish.
By default, it is 0, the maximum value is 300.
webhooks (list of dict, optional) – Optional ad-hoc webhooks
associated with the actor run which can be used to receive a notification,
e.g. when the actor finished or failed.
If you already have a webhook set up for the actor or task, you do not have to add it again here.
Each webhook is represented by a dictionary containing these items:
event_types: list of WebhookEventType values which trigger the webhook
request_url: URL to which to send the webhook HTTP request
payload_template (optional): Optional template for the request payload
The run object
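The webhook dictionaries described for the webhooks parameter can be assembled with a small helper. This is a sketch only: make_ad_hoc_webhook is a hypothetical name, the URL is a placeholder, and the event type strings are assumed placeholders standing in for WebhookEventType values.

```python
def make_ad_hoc_webhook(event_types, request_url, payload_template=None):
    """Build one webhook dict with the items listed for the webhooks parameter."""
    if not event_types:
        raise ValueError("at least one event type is required")
    webhook = {
        "event_types": list(event_types),
        "request_url": request_url,
    }
    # payload_template is optional, so it is only included when given.
    if payload_template is not None:
        webhook["payload_template"] = payload_template
    return webhook


# Assumption: these strings stand in for WebhookEventType values.
webhooks = [
    make_ad_hoc_webhook(
        ["ACTOR.RUN.SUCCEEDED", "ACTOR.RUN.FAILED"],
        "https://example.com/apify-hook",
    )
]
print(webhooks)

# With a real client, the list would then be passed along these lines:
# client.actor('username/actor-name').start(webhooks=webhooks)
```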
ActorClient.call(*, run_input=None, content_type=None, build=None, memory_mbytes=None, timeout_secs=None, webhooks=None, wait_secs=None)
Start the actor and wait for it to finish before returning the Run object.
It waits indefinitely, unless the wait_secs argument is provided.
webhooks (list, optional) – Optional webhooks associated with the actor run,
which can be used to receive a notification, e.g. when the actor finished or failed.
If you already have a webhook set up for the actor, you do not have to add it again here.
wait_secs (int, optional) – The maximum number of seconds the server waits for the run to finish. If not provided, waits indefinitely.
ActorClient.build(*, version_number, beta_packages=None, tag=None, use_cache=None, wait_for_finish=None)
Build the actor.
version_number (str) – Actor version number to be built.
beta_packages (bool, optional) – If True, then the actor is built with beta versions of Apify NPM packages.
By default, the build uses latest stable packages.
tag (str, optional) – Tag to be applied to the build on success. By default, the tag is taken from the actor version’s buildTag property.
use_cache (bool, optional) – If true, the actor’s Docker container will be rebuilt using layer cache.
This is to enable quick rebuild during development.
By default, the cache is not used.
wait_for_finish (int, optional) – The maximum number of seconds the server waits for the build to finish before returning.
By default it is 0, the maximum value is 300.
The build object
ActorClient.builds()
Retrieve a client for the builds of this actor.
ActorClient.runs()
Retrieve a client for the runs of this actor.
ActorClient.last_run(*, status=None)
Retrieve the client for the last run of this actor.
Last run is retrieved based on the start time of the runs.
status (ActorJobStatus, optional) – Consider only runs with this status.
The resource client for the last run of this actor.
ActorClient.versions()
Retrieve a client for the versions of this actor.
ActorVersionCollectionClient
ActorClient.version(version_number)
Retrieve the client for the specified version of this actor.
version_number (str) – The version number for which to retrieve the resource client.
The resource client for the specified actor version.
ActorVersionClient
ActorClient.webhooks()
Retrieve a client for webhooks associated with this actor.
ActorCollectionClient
Sub-client for manipulating actors.
list()
create()
ActorCollectionClient.list(*, my=None, limit=None, offset=None, desc=None)
List the actors the user has created or used.
my (bool, optional) – If True, will return only actors which the user has created themselves.
limit (int, optional) – How many actors to list
offset (int, optional) – What actor to include as first when retrieving the list
desc (bool, optional) – Whether to sort the actors in descending order based on their creation date
The list of available actors matching the specified filters.
ListPage
ActorCollectionClient.create(*, name, title=None, description=None, seo_title=None, seo_description=None, versions=None, restart_on_error=None, is_public=None, is_deprecated=None, is_anonymously_runnable=None, categories=None, default_run_build=None, default_run_memory_mbytes=None, default_run_timeout_secs=None, example_run_input_body=None, example_run_input_content_type=None)
Create a new actor.
name (str) – The name of the actor
The created actor.
ActorVersionClient
Sub-client for manipulating a single actor version.
ActorVersionClient.get()
Return information about the actor version.
The retrieved actor version data
ActorVersionClient.update(*, build_tag=None, env_vars=None, apply_env_vars_to_build=None, source_type=None, source_code=None, base_docker_image=None, source_files=None, git_repo_url=None, tarball_url=None, github_gist_url=None)
Update the actor version with specified fields.
build_tag (str, optional) – Tag that is automatically set to the latest successful build of the current version.
env_vars (list of dict, optional) – Environment variables that will be available to the actor run process,
and optionally also to the build process. See the API docs for their exact structure.
apply_env_vars_to_build (bool, optional) – Whether the environment variables specified for the actor run
will also be set to the actor build process.
source_type (ActorSourceType, optional) – What source type is the actor version using.
source_code (str, optional) – Source code as a single JavaScript/ file,
using the base Docker image specified in baseDockerImage.
Required when source_type is ActorSourceType.SOURCE_CODE.
base_docker_image (str, optional) – The base Docker image to use for single-file actors.
source_files (list of dict, optional) – Source code comprised of multiple files, each an item of the array.
Required when source_type is ActorSourceType.SOURCE_FILES. See the API docs for the exact structure.
git_repo_url (str, optional) – The URL of a Git repository from which the source code will be cloned.
Required when source_type is ActorSourceType.GIT_REPO.
tarball_url (str, optional) – The URL of a tarball or a zip archive from which the source code will be downloaded.
Required when source_type is ActorSourceType.TARBALL.
github_gist_url (str, optional) – The URL of a GitHub Gist from which the source will be downloaded.
Required when source_type is ActorSourceType.GITHUB_GIST.
The updated actor version
ActorVersionClient.delete()
Delete the actor version.
ActorVersionCollectionClient
Sub-client for manipulating actor versions.
ActorVersionCollectionClient.list()
List the available actor versions.
The list of available actor versions.
ActorVersionCollectionClient.create(*, version_number, build_tag=None, env_vars=None, apply_env_vars_to_build=None, source_type, source_code=None, base_docker_image=None, source_files=None, git_repo_url=None, tarball_url=None, github_gist_url=None)
Create a new actor version.
version_number (str) – Major and minor version of the actor (e.g. 1.0)
source_type (ActorSourceType) – What source type is the actor version using.
The created actor version
RunClient
Sub-client for manipulating a single actor run.
abort()
wait_for_finish()
metamorph()
resurrect()
RunClient.get()
Return information about the actor run.
The retrieved actor run data
RunClient.abort(*, gracefully=None)
Abort the actor run which is starting or currently running and return its details.
gracefully (bool, optional) – If True, the actor run will abort gracefully.
It will send aborting and persistStates events into the run and force-stop the run after 30 seconds.
It is helpful in cases where you plan to resurrect the run later.
The data of the aborted actor run
RunClient.wait_for_finish(*, wait_secs=None)
Wait synchronously until the run finishes or the server times out.
wait_secs (int, optional) – How long the client waits for the run to finish. None means it waits indefinitely.
The actor run data. If the status on the object is not one of the terminal statuses
(SUCCEEDED, FAILED, TIMED_OUT, ABORTED), then the run has not yet finished.
RunClient.metamorph(*, target_actor_id, target_actor_build=None, run_input=None, content_type=None)
Transform an actor run into a run of another actor with a new input.
target_actor_id (str) – ID of the target actor that the run should be transformed into
target_actor_build (str, optional) – The build of the target actor. It can be either a build tag or build number.
By default, the run uses the build specified in the default run configuration for the target actor (typically the latest build).
run_input (Any, optional) – The input to pass to the new run.
The actor run data.
RunClient.resurrect()
Resurrect a finished actor run.
Only finished runs, i.e. runs with status FINISHED, FAILED, ABORTED and TIMED-OUT can be resurrected.
Run status will be updated to RUNNING and its container will be restarted with the same default storages.
RunClient.dataset()
Get the client for the default dataset of the actor run.
A client allowing access to the default dataset of this actor run.
RunClient.key_value_store()
Get the client for the default key-value store of the actor run.
A client allowing access to the default key-value store of this actor run.
RunClient.request_queue()
Get the client for the default request queue of the actor run.
A client allowing access to the default request queue of this actor run.
RunClient.log()
Get the client for the log of the actor run.
A client allowing access to the log of this actor run.
RunCollectionClient
Sub-client for listing actor runs.
RunCollectionClient.list(*, limit=None, offset=None, desc=None, status=None)
List all actor runs (either of a single actor, or all user’s actors, depending on where this client was initialized from).
limit (int, optional) – How many runs to retrieve
offset (int, optional) – What run to include as first when retrieving the list
desc (bool, optional) – Whether to sort the runs in descending order based on their start date
status (ActorJobStatus, optional) – Retrieve only runs with the provided status
The retrieved actor runs
BuildClient
Sub-client for manipulating a single actor build.
BuildClient.get()
Return information about the actor build.
The retrieved actor build data
BuildClient.abort()
Abort the actor build which is starting or currently running and return its details.
The data of the aborted actor build
BuildClient.wait_for_finish(*, wait_secs=None)
Wait synchronously until the build finishes or the server times out.
wait_secs (int, optional) – How long the client waits for the build to finish. None means it waits indefinitely.
The actor build data. If the status on the object is not one of the terminal statuses
(SUCCEEDED, FAILED, TIMED_OUT, ABORTED), then the build has not yet finished.
BuildCollectionClient
Sub-client for listing actor builds.
BuildCollectionClient.list(*, limit=None, offset=None, desc=None)
List all actor builds (either of a single actor, or all user’s actors, depending on where this client was initialized from).
limit (int, optional) – How many builds to retrieve
offset (int, optional) – What build to include as first when retrieving the list
desc (bool, optional) – Whether to sort the builds in descending order based on their start date
The retrieved actor builds
DatasetClient
Sub-client for manipulating a single dataset.
list_items()
iterate_items()
download_items()
stream_items()
push_items()
DatasetClient.get()
Retrieve the dataset.
The retrieved dataset, or None, if it does not exist
DatasetClient.update(*, name=None)
Update the dataset with specified fields.
name (str, optional) – The new name for the dataset
The updated dataset
DatasetClient.delete()
Delete the dataset.
DatasetClient.list_items(*, offset=None, limit=None, clean=None, desc=None, fields=None, omit=None, unwind=None, skip_empty=None, skip_hidden=None)
List the items of the dataset.
offset (int, optional) – Number of items that should be skipped at the start. The default value is 0
limit (int, optional) – Maximum number of items to return. By default there is no limit.
desc (bool, optional) – By default, results are returned in the same order as they were stored.
To reverse the order, set this parameter to True.
clean (bool, optional) – If True, returns only non-empty items and skips hidden fields (i.e. fields starting with the # character).
The clean parameter is just a shortcut for skip_hidden=True and skip_empty=True parameters.
Note that since some objects might be skipped from the output, the result might contain fewer items than the limit value.
fields (list of str, optional) – A list of fields which should be picked from the items,
only these fields will remain in the resulting record objects.
Note that the fields in the outputted items are sorted the same way as they are specified in the fields parameter.
You can use this feature to effectively fix the output format.
omit (list of str, optional) – A list of fields which should be omitted from the items.
unwind (str, optional) – Name of a field which should be unwound.
If the field is an array, then every element of the array becomes a separate record that is merged with the parent object.
If the unwound field is an object then it is merged with the parent object.
If the unwound field is missing or its value is neither an array nor an object and therefore cannot be merged with a parent object,
then the item gets preserved as it is. Note that the unwound items ignore the desc parameter.
skip_empty (bool, optional) – If True, then empty items are skipped from the output.
Note that if used, the results might contain fewer items than the limit value.
skip_hidden (bool, optional) – If True, then hidden fields are skipped from the output, i.e. fields starting with the # character.
A page of the list of dataset items according to the specified filters.
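The unwind rules described for list_items can be illustrated locally on plain dictionaries. The sketch below is not the API's implementation. It assumes that the unwound field is removed from the parent once merged, that child keys win on a name collision, and that non-object array elements stay under the original field name; empty arrays simply produce no records here. The function name unwind_items is hypothetical.

```python
def unwind_items(items, field):
    """Apply the unwind rules from the text to a list of dict items."""
    result = []
    for item in items:
        value = item.get(field)
        # Parent object without the field being unwound (assumption).
        parent = {key: val for key, val in item.items() if key != field}
        if isinstance(value, list):
            # Every array element becomes its own record merged with the parent.
            for element in value:
                if isinstance(element, dict):
                    result.append({**parent, **element})
                else:
                    # Assumption: scalars are kept under the original field name.
                    result.append({**parent, field: element})
        elif isinstance(value, dict):
            # An object is merged into the parent object.
            result.append({**parent, **value})
        else:
            # Missing or scalar field: the item is preserved as it is.
            result.append(item)
    return result


items = [
    {"shop": "A", "products": [{"name": "x"}, {"name": "y"}]},
    {"shop": "B", "products": {"name": "z"}},
    {"shop": "C"},
]
unwound = unwind_items(items, "products")
print(unwound)
```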
DatasetClient.iterate_items(*, offset=0, limit=None, clean=None, desc=None, fields=None, omit=None, unwind=None, skip_empty=None, skip_hidden=None)
Iterate over the items in the dataset.
Yields
dict – An item from the dataset
Generator
DatasetClient.download_items(*, item_format='json', offset=None, limit=None, desc=None, clean=None, bom=None, delimiter=None, fields=None, omit=None, unwind=None, skip_empty=None, skip_header_row=None, skip_hidden=None, xml_root=None, xml_row=None)
Download the items in the dataset as raw bytes.
item_format (str) – Format of the results, possible values are: json, jsonl, csv, html, xlsx, xml and rss. The default value is json.
bom (bool, optional) – All text responses are encoded in UTF-8 encoding.
By default, csv files are prefixed with the UTF-8 Byte Order Mark (BOM),
while json, jsonl, xml, html and rss files are not. If you want to override this default behavior,
specify bom=True query parameter to include the BOM or bom=False to skip it.
delimiter (str, optional) – A delimiter character for CSV files. The default delimiter is a simple comma (,).
skip_header_row (bool, optional) – If True, then header row in the csv format is skipped.
xml_root (str, optional) – Overrides default root element name of xml output. By default the root element is items.
xml_row (str, optional) – Overrides default element name that wraps each page or page function result object in xml output.
By default the element name is item.
The dataset items as raw bytes
bytes
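Since download_items returns raw bytes and csv output may start with a UTF-8 BOM, the bytes need decoding before they can be parsed. A minimal sketch using only the standard library follows. The sample bytes are made up for illustration, and parse_csv_items is a hypothetical helper, not part of the client.

```python
import codecs
import csv
import io


def parse_csv_items(raw: bytes, delimiter: str = ","):
    """Decode CSV bytes (with or without a UTF-8 BOM) into a list of dicts."""
    # "utf-8-sig" strips a leading BOM if present and is harmless otherwise.
    text = raw.decode("utf-8-sig")
    reader = csv.DictReader(io.StringIO(text, newline=""), delimiter=delimiter)
    return [dict(row) for row in reader]


# Made-up bytes shaped like a CSV download that is prefixed with the BOM.
sample = codecs.BOM_UTF8 + "name,price\r\nwidget,10\r\n".encode("utf-8")
rows = parse_csv_items(sample)
print(rows)
```

Decoding with plain "utf-8" instead would leave the BOM glued to the first column name, which is an easy mistake to miss.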
DatasetClient.stream_items(*, item_format='json', offset=None, limit=None, desc=None, clean=None, bom=None, delimiter=None, fields=None, omit=None, unwind=None, skip_empty=None, skip_header_row=None, skip_hidden=None, xml_root=None, xml_row=None)
Retrieve the items in the dataset as a file-like object.
The dataset items as a file-like object
DatasetClient.push_items(items)
Push items to the dataset.
items (Union[str, int, float, bool, None, Dict[str, Any], List[Any]]) – The items to push to the dataset. Either a stringified JSON, a dictionary, or a list of strings or dictionaries.
DatasetCollectionClient
Sub-client for manipulating datasets.
get_or_create()
DatasetCollectionClient.list(*, unnamed=None, limit=None, offset=None, desc=None)
List the available datasets.
unnamed (bool, optional) – Whether to include unnamed datasets in the list
limit (int, optional) – How many datasets to retrieve
offset (int, optional) – What dataset to include as first when retrieving the list
desc (bool, optional) – Whether to sort the datasets in descending order based on their modification date
The list of available datasets matching the specified filters.
DatasetCollectionClient.get_or_create(*, name=None)
Retrieve a named dataset, or create a new one when it doesn’t exist.
name (str, optional) – The name of the dataset to retrieve or create.
The retrieved or newly-created dataset.
KeyValueStoreClient
Sub-client for manipulating a single key-value store.
list_keys()
get_record()
set_record()
delete_record()
KeyValueStoreClient.get()
Retrieve the key-value store.
The retrieved key-value store, or None if it does not exist
KeyValueStoreClient.update(*, name=None)
Update the key-value store with specified fields.
name (str, optional) – The new name for key-value store
The updated key-value store
KeyValueStoreClient.delete()
Delete the key-value store.
KeyValueStoreClient.list_keys(*, limit=None, exclusive_start_key=None)
List the keys in the key-value store.
limit (int, optional) – Number of keys to be returned. Maximum value is 1000
exclusive_start_key (str, optional) – All keys up to and including this one are skipped from the result
The list of keys in the key-value store matching the given arguments
KeyValueStoreClient.get_record(key, *, as_bytes=False, as_file=False)
Retrieve the given record from the key-value store.
key (str) – Key of the record to retrieve
as_bytes (bool, optional) – Whether to retrieve the record as unparsed bytes, default False
as_file (bool, optional) – Whether to retrieve the record as a file-like object, default False
The requested record, or None, if the record does not exist
KeyValueStoreClient.set_record(key, value, content_type=None)
Set a value to the given record in the key-value store.
key (str) – The key of the record to save the value to
value (Any) – The value to save into the record
content_type (str, optional) – The content type of the saved value
KeyValueStoreClient.delete_record(key)
Delete the specified record from the key-value store.
key (str) – The key of the record to delete
KeyValueStoreCollectionClient
Sub-client for manipulating key-value stores.
KeyValueStoreCollectionClient.list(*, unnamed=None, limit=None, offset=None, desc=None)
List the available key-value stores.
unnamed (bool, optional) – Whether to include unnamed key-value stores in the list
limit (int, optional) – How many key-value stores to retrieve
offset (int, optional) – What key-value store to include as first when retrieving the list
desc (bool, optional) – Whether to sort the key-value stores in descending order based on their modification date
The list of available key-value stores matching the specified filters.
KeyValueStoreCollectionClient.get_or_create(*, name=None)
Retrieve a named key-value store, or create a new one when it doesn’t exist.
name (str, optional) – The name of the key-value store to retrieve or create.
The retrieved or newly-created key-value store.
RequestQueueClient
Sub-client for manipulating a single request queue.
list_head()
add_request()
get_request()
update_request()
delete_request()
RequestQueueClient.get()
Retrieve the request queue.
The retrieved request queue, or None, if it does not exist
RequestQueueClient.update(*, name=None)
Update the request queue with specified fields.
name (str, optional) – The new name for the request queue
The updated request queue
RequestQueueClient.delete()
Delete the request queue.
RequestQueueClient.list_head(*, limit=None)
Retrieve a given number of requests from the beginning of the queue.
limit (int, optional) – How many requests to retrieve
The desired number of requests from the beginning of the queue.
RequestQueueClient.add_request(request, *, forefront=None)
Add a request to the queue.
request (dict) – The request to add to the queue
forefront (bool, optional) – Whether to add the request to the head or the end of the queue
The added request.
RequestQueueClient.get_request(request_id)
Retrieve a request from the queue.
request_id (str) – ID of the request to retrieve
The retrieved request, or None, if it did not exist.
RequestQueueClient.update_request(request, *, forefront=None)
Update a request in the queue.
request (dict) – The updated request
forefront (bool, optional) – Whether to put the updated request in the beginning or the end of the queue
The updated request
RequestQueueClient.delete_request(request_id)
Delete a request from the queue.
request_id (str) – ID of the request to delete.
RequestQueueCollectionClient
Sub-client for manipulating request queues.
RequestQueueCollectionClient.list(*, unnamed=None, limit=None, offset=None, desc=None)
List the available request queues.
unnamed (bool, optional) – Whether to include unnamed request queues in the list
limit (int, optional) – How many request queues to retrieve
offset (int, optional) – What request queue to include as first when retrieving the list
desc (bool, optional) – Whether to sort the request queues in descending order based on their modification date
The list of available request queues matching the specified filters.
RequestQueueCollectionClient.get_or_create(*, name=None)
Retrieve a named request queue, or create a new one when it doesn’t exist.
name (str, optional) – The name of the request queue to retrieve or create.
The retrieved or newly-created request queue.
LogClient
Sub-client for manipulating logs.
stream()
LogClient.get()
Retrieve the log as text.
The retrieved log, or None, if it does not exist.
str, optional
LogClient.stream()
Retrieve the log as a file-like object.
The retrieved log as a file-like object, or None, if it does not exist.
Return type, optional
WebhookClient
Sub-client for manipulating a single webhook.
test()
dispatches()
WebhookClient.get()
Retrieve the webhook.
The retrieved webhook, or None if it does not exist
WebhookClient.update(*, event_types=None, request_url=None, payload_template=None, actor_id=None, actor_task_id=None, actor_run_id=None, ignore_ssl_errors=None, do_not_retry=None, is_ad_hoc=None)
Update the webhook.
event_types (list of WebhookEventType, optional) – List of event types that should trigger the webhook. At least one is required.
request_url (str, optional) – URL that will be invoked once the webhook is triggered.
payload_template (str, optional) – Specification of the payload that will be sent to request_url
actor_id (str, optional) – Id of the actor whose runs should trigger the webhook.
actor_task_id (str, optional) – Id of the actor task whose runs should trigger the webhook.
actor_run_id (str, optional) – Id of the actor run which should trigger the webhook.
ignore_ssl_errors (bool, optional) – Whether the webhook should ignore SSL errors returned by request_url
do_not_retry (bool, optional) – Whether the webhook should retry sending the payload to request_url upon
failure.
is_ad_hoc (bool, optional) – Set to True if you want the webhook to be triggered only the first time the
condition is fulfilled. Only applicable when actor_run_id is filled.
The updated webhook
WebhookClient.delete()
Delete the webhook.
WebhookClient.test()
Test a webhook.
Creates a webhook dispatch with a dummy payload.
The webhook dispatch created by the test
WebhookClient.dispatches()
Get dispatches of the webhook.
A client allowing access to dispatches of this webhook using its list method
WebhookCollectionClient
Sub-client for manipulating webhooks.
WebhookCollectionClient.list(*, limit=None, offset=None, desc=None)
List the available webhooks.
limit (int, optional) – How many webhooks to retrieve
offset (int, optional) – What webhook to include as first when retrieving the list
desc (bool, optional) – Whether to sort the webhooks in descending order based on their date of creation
bn1/python-apify: Python client for Apify API. – GitHub
Apify is a package with Python bindings for – a platform for creating web crawlers.
Install
Apify is available for Python 2.7 and above
Examples
Getting all stored records from dummy-store:
from apify import KeyValueStore

# Iterate over every record stored in 'dummy-store'.
store = KeyValueStore('dummy-store')
for record in store:
    print(record)
Getting only record pages 100, 101,…, 198, 199 from dummy-store:
= 99 # starts with fetching the next (100th) page
if == 200:
break
Contribute
Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug.
Fork the repository on GitHub to start making your changes to the master branch (or branch off of it).
Write a test which shows that the bug was fixed or that the feature works as expected.
Send a pull request and bug the maintainer until it gets merged and published. 🙂
apify-client – PyPI
This is an official client for the Apify API.
It’s still a work in progress, so please don’t use it yet in production environments!
Installation
Requires Python 3.7+
You can install the client from its PyPI listing.
To do that, simply run pip install apify-client in your terminal.
Usage
For usage instructions, check the documentation on Apify Docs or in docs/
Development
Environment
For local development, it is required to have Python 3.7 installed.
It is recommended to set up a virtual environment while developing this package to isolate your development environment.
However, because Python can be installed and virtual environments can be set up in many different ways,
this is left up to the developers to do themselves.
One recommended way is with the builtin venv module:
python3 -m venv
source
To improve on the experience, you can use pyenv to have an environment with a pinned Python version,
and direnv to automatically activate/deactivate the environment when you enter/exit the project folder.
Dependencies
To install this package and its development dependencies, run pip install -e '.[dev]'
Formatting
We use autopep8 and isort to automatically format the code to a common format. To run the formatting, just run. /
Linting and Testing
We use flake8 for linting, mypy for type checking and pytest for unit testing. To run these tools, just run. /
Documentation
We use the Google docstring format for documenting the code.
We document every user-facing class or method, and enforce that using the flake8-docstrings library.
The documentation is then rendered from the docstrings in the code using Sphinx and some heavy post-processing and saved as docs/
To generate the documentation, just run. /
Release process
Publishing new versions to PyPI happens automatically through GitHub Actions.
On each commit to the master branch, a new beta release is published, taking the version number from src/apify_client/
and automatically incrementing the beta version suffix by 1 from the last beta release published to PyPI.
A stable version is published when a new release is created using GitHub Releases, again taking the version number from src/apify_client/ The built package assets are automatically uploaded to the GitHub release.
If there is already a stable version with the same version number as in src/apify_client/ published to PyPI, the publish process fails,
so don’t forget to update the version number before releasing a new version.
The release process also fails when the released version is not described in,
so don’t forget to describe the changes in the new version there.
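The beta numbering described in this section can be sketched as follows. This is not the project's actual release script. It assumes beta releases use the PEP 440 suffix form such as 1.2.0b3, and the function name next_beta_version and the version numbers are made up for illustration.

```python
import re


def next_beta_version(base_version, published_versions):
    """Return base_version with the beta suffix one above the last published beta."""
    pattern = re.compile(r"^" + re.escape(base_version) + r"b(\d+)$")
    suffixes = []
    for version in published_versions:
        match = pattern.match(version)
        if match:
            suffixes.append(int(match.group(1)))
    # Start at b1 when no beta of this base version has been published yet.
    return f"{base_version}b{max(suffixes, default=0) + 1}"


# Made-up list of versions already published.
published = ["1.1.0", "1.1.0b4", "1.2.0b1", "1.2.0b2"]
print(next_beta_version("1.2.0", published))
```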