question: Pyspark Processing Jobs in Local Mode? #19

Description

@dcompgriff

Hello. I was wondering whether there is a tutorial, or current support, for 1) running a PySpark processing job locally and 2) doing so with a custom base Docker (EMR) image. I see a tutorial for Dask using a script processor, and also some code for an SKLearn-based processor. My goal is to set up a local testing/dev environment that runs SageMaker Spark processor code. I'm guessing this is more complicated than the other use cases, since this processor is usually backed by an EMR cluster.
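For context, here is a minimal sketch of the setup the question is about. It assumes the PySparkProcessor would accept local mode the same way the SKLearn/Script processors do (via instance_type="local" and a LocalSession); whether that actually works, and whether image_uri can point at a custom EMR-based image in that mode, is exactly what is being asked. The role ARN, script path, and framework version below are placeholders.

```python
from sagemaker.local import LocalSession
from sagemaker.spark.processing import PySparkProcessor

# Local session so the job runs in Docker on this machine instead of on SageMaker.
local_session = LocalSession()
local_session.config = {"local": {"local_code": True}}  # use local files, no S3 upload

processor = PySparkProcessor(
    base_job_name="local-spark-test",
    framework_version="3.1",                            # assumption: any supported Spark version
    role="arn:aws:iam::111111111111:role/DummyRole",    # placeholder role for local mode
    instance_count=1,
    instance_type="local",                              # assumption: local mode, as with other processors
    sagemaker_session=local_session,
    # image_uri="<custom EMR-based image>",             # hypothetical custom base image
)

processor.run(
    submit_app="./preprocess.py",                       # hypothetical local PySpark script
    arguments=["--input", "./data"],
)
```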
