Channel: Working notes by Matthew Rocklin - SciPy
Browsing all 100 articles

Fast Message Serialization

This work is supported by Continuum Analytics and the XDATA Program as part of the Blaze Project. Very high performance isn’t about doing one thing well, it’s about doing nothing poorly. This week I...

Ad Hoc Distributed Random Forests

This work is supported by Continuum Analytics and the XDATA Program as part of the Blaze Project. A screencast version of this post is available here:...

Data Bandwidth

This work is supported by Continuum Analytics and the XDATA Program as part of the Blaze Project. tl;dr: We list and combine common bandwidths relevant in data science. Understanding data bandwidths helps...

Disk Bandwidth

This work is supported by Continuum Analytics and the XDATA Program as part of the Blaze Project. tl;dr: Disk read and write bandwidths depend strongly on block size. Disk read/write bandwidths on...

Introducing Dask distributed

This work is supported by Continuum Analytics and the XDATA Program as part of the Blaze Project. tl;dr: We analyze JSON data on a cluster using pure Python projects. Dask, a Python library for parallel...

Pandas on HDFS with Dask Dataframes

This work is supported by Continuum Analytics and the XDATA Program as part of the Blaze Project. In this post we use Pandas in parallel across an HDFS cluster to read CSV data. We coordinate these...

Distributed Dask Arrays

This work is supported by Continuum Analytics and the XDATA Program as part of the Blaze Project. In this post we analyze weather data across a cluster using NumPy in parallel with dask.array. We focus...

Dask for Institutions

This work is supported by Continuum Analytics. Introduction: Institutions use software differently than individuals. Over the last few months I’ve had dozens of conversations about using Dask within larger...

Supporting Users in Open Source

What are the social expectations of open source developers to help users understand their projects? What are the social expectations of users when asking for help? As part of developing Dask, an open...

Dask Distributed Release 1.13.0

I’m pleased to announce a release of Dask’s distributed scheduler, dask.distributed, version 1.13.0.

    conda install dask distributed -c conda-forge

or

    pip install dask distributed --upgrade

The last few...

Where to Write Prose?

Code is only as good as its prose. Like many programmers I spend more time writing prose than code. This is great; writing clean prose focuses my thoughts during design and disseminates understanding so...

Dask and Celery

This post compares two Python distributed task processing systems, Dask.distributed and Celery. Disclaimer: technical comparisons are hard to do well. I am biased towards Dask and ignorant of correct...

Dask Cluster Deployments

This work is supported by Continuum Analytics and the XDATA Program as part of the Blaze Project. All code in this post is experimental. It should not be relied upon. For people looking to deploy...

Dask Development Log

This work is supported by Continuum Analytics, the XDATA Program, and the Data Driven Discovery Initiative from the Moore Foundation. Dask has been active lately due to a combination of increased adoption...

Dask Development Log

This work is supported by Continuum Analytics, the XDATA Program, and the Data Driven Discovery Initiative from the Moore Foundation. To increase transparency I’m blogging weekly about the work done on...

Dask Development Log

This work is supported by Continuum Analytics, the XDATA Program, and the Data Driven Discovery Initiative from the Moore Foundation. To increase transparency I’m blogging weekly about the work done on...

Dask Development Log

This work is supported by Continuum Analytics, the XDATA Program, and the Data Driven Discovery Initiative from the Moore Foundation. To increase transparency I’m blogging weekly about the work done on...

Dask Release 0.13.0

This work is supported by Continuum Analytics, the XDATA Program, and the Data Driven Discovery Initiative from the Moore Foundation. Summary: Dask just grew to version 0.13.0. This is a significant release...
