Dask Benchmarks
This work is supported by Continuum Analytics and the Data Driven Discovery Initiative from the Moore Foundation. Summary: We measure the performance of Dask’s distributed scheduler for a variety of...
Scikit-Image and Dask Performance
This weekend at the SciPy 2017 sprints I worked alongside Scikit-image developers to investigate parallelizing scikit-image with Dask. Here is a notebook of our work.
Dask Release 0.15.2
This work is supported by Anaconda Inc. and the Data Driven Discovery Initiative from the Moore Foundation. I’m pleased to announce the release of Dask version 0.15.2. This release contains stability...
Dask Release 0.15.3
This work is supported by Anaconda Inc. and the Data Driven Discovery Initiative from the Moore Foundation. I’m pleased to announce the release of Dask version 0.15.3. This release contains stability...
Notes on Kafka in Python
Summary: I recently investigated the state of Python libraries for Kafka. This blogpost contains my findings. Both PyKafka and confluent-kafka have mature implementations and are maintained by invested...
Streaming Dataframes
This work is supported by Anaconda Inc and the Data Driven Discovery Initiative from the Moore Foundation. This post is about experimental software. This is not ready for public use. All code examples...
Optimizing Data Structure Access in Python
This work is supported by Anaconda Inc and the Data Driven Discovery Initiative from the Moore Foundation. Last week at PyCon DE I had the good fortune to meet Stefan Behnel, one of the core developers...
Dask Release 0.16.0
This work is supported by Anaconda Inc. and the Data Driven Discovery Initiative from the Moore Foundation. I’m pleased to announce the release of Dask version 0.16.0. This is a major release with new...
Dask Development Log
This work is supported by Anaconda Inc and the Data Driven Discovery Initiative from the Moore Foundation. To increase transparency I’m trying to blog more often about the current work going on around...
Pangeo: JupyterHub, Dask, and XArray on the Cloud
This work is supported by Anaconda Inc, the NSF EarthCube program, and UC Berkeley BIDS. A few weeks ago a few of us stood up pangeo.pydata.org, an experimental deployment of JupyterHub, Dask, and XArray...
Write Dumb Code
The best way you can contribute to an open source project is to remove lines of code from it. We should endeavor to write code that a novice programmer can easily understand without explanation or that...
The Case for Numba in Community Code
The numeric Python community should consider adopting Numba more widely within community code. Numba is strong in performance and usability, but historically weak in ease of installation and community...
HDF in the Cloud
Multi-dimensional data, such as is commonly stored in HDF and NetCDF formats, is difficult to access on traditional cloud storage platforms. This post outlines the situation, the following possible...
Dask Release 0.17.0
This work is supported by Anaconda Inc. and the Data Driven Discovery Initiative from the Moore Foundation. I’m pleased to announce the release of Dask version 0.17.0. This is a significant major release...
Craft Minimal Bug Reports
Following up on a post on supporting users in open source, this post lists some suggestions on how to ask a maintainer to help you with a problem. You don’t have to follow these suggestions. They are...
Summer Student Projects 2018
Around this time of year students look for Summer projects. Often they get internships at potential future employers. Sometimes they become more engaged in open source software. This blogpost contains...
Dask Release 0.17.2
This work is supported by Anaconda Inc. and the Data Driven Discovery Initiative from the Moore Foundation. I’m pleased to announce the release of Dask version 0.17.2. This is a minor release with new...
Beyond Numpy Arrays in Python
Executive Summary: In recent years Python’s array computing ecosystem has grown organically to support GPUs, sparse, and distributed arrays. This is wonderful and a great example of the growth that can...
Dask Release 0.18.0
This work is supported by Anaconda Inc. I’m pleased to announce the release of Dask version 0.18.0. This is a major release with breaking changes and new features. The last release was 0.17.5 on May...