2019 |
Perceptual Compression for Video Storage and Processing Systems.
Proceedings of the ACM Symposium on Cloud Computing (SoCC '19). Best Poster Award! |
Visual Road: A Video Data Management Benchmark.
Proceedings of the 2019 International Conference on Management of Data (SIGMOD '19). |
|
Synthesizing number generators for stochastic computing using mixed integer programming.
CoRR. |
|
2018 |
Architecture Considerations for Stochastic Computing Accelerators.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. |
Automating Generation of Low Precision Deep Learning Operators.
CoRR. |
|
Parameter Hub: A Rack-Scale Parameter Server for Distributed Deep Neural Network Training.
Proceedings of the ACM Symposium on Cloud Computing (SoCC '18). |
|
Stochastic Synthesis for Stochastic Computing.
CoRR. |
|
LightDB: A DBMS for Virtual Reality Video.
Proc. VLDB Endow. |
|
Application Codesign of Near-Data Processing for Similarity Search.
2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS). |
|
A Taxonomy of General Purpose Approximate Computing Techniques.
IEEE Embedded Systems Letters. |
|
Iterative Search for Reconfigurable Accelerator Blocks With a Compiler in the Loop.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. |
|
2017 |
Exploiting Quality-energy Tradeoffs with Arbitrary Quantization: Special Session Paper.
Proceedings of the Twelfth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis Companion (CODES '17). |
Exploring computation-communication tradeoffs in camera systems.
IEEE International Symposium on Workload Characterization (IISWC). |
|
Customizing Progressive JPEG for Efficient Image Storage.
USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage). |
|
A Hardware-Friendly Bilateral Solver for Real-Time Virtual Reality Video.
High-Performance Graphics (HPG). |
|
Solver Aided Reverse Engineering of Architectural Features.
Workshop on Duplicating, Deconstructing and Debunking (WDDD w/ ISCA). |
|
Similarity Search on Automata Processors.
IEEE International Parallel & Distributed Processing Symposium (IPDPS). |
|
Profiling a GPU database implementation: a holistic view of GPU resource utilization on TPC-H queries.
International Workshop on Data Management on New Hardware (DAMON w/ SIGMOD). |
|
VisualCloud Demonstration: A DBMS for Virtual Reality.
ACM International Conference on Management of Data (SIGMOD). |
|
Augmenting Interpersonal Communication through Connected Light.
ACM Conference on Human Factors in Computing Systems (CHI) Extended Abstracts. |
|
Approximate Storage of Compressed and Encrypted Videos.
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). |
|
IncBricks: Toward In-Network Computation with an In-Network Cache.
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). |
|
Energy-Efficient Hybrid Stochastic-Binary Neural Networks for Near-Sensor Computing.
Design, Automation & Test in Europe (DATE). |
|
2016 |
Disciplined Inconsistency with Consistency Types.
ACM Symposium on Cloud Computing (SoCC). |
An evaluation of contemporary heterogeneous computing platforms for data intensive applications.
Workshop on Efficient Data Center Systems (EDCS w/ ISCA). |
|
Specifying and Checking File System Crash-Consistency Models.
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). |
|
A DNA-Based Archival Storage System.
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). |
|
High-Density Image Storage Using Approximate Memory Cells.
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). |
|
Optimizing Synthesis with Metasketches.
Symposium on Principles of Programming Languages (POPL). |
|
2015 |
Probability Type Inference for Flexible Approximate Programming.
OOPSLA. |
Latency-Tolerant Software Distributed Shared Memory.
2015 USENIX Annual Technical Conference (USENIX ATC 15). |
|
Approximate Program Synthesis.
Workshop on Approximate Computing Across the Stack (WAX w/ PLDI). |
|
Hardware-Software Co-Design: Not Just a Cliche.
Summit on Advances in Programming Languages (SNAPL). |
|
Approximate Computing: Making Mobile Systems More Efficient.
Pervasive Computing, IEEE. |
|
Claret: Using Data Types for Highly Concurrent Distributed Transactions.
Workshop on Principles and Practice of Consistency (PaPoC'15 w/ EuroSys). |
|
Monitoring and Debugging the Quality of Results in Approximate Programs.
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). |
|
Data Provenance Tracking for Concurrent Programs.
International Symposium on Code Generation and Optimization (CGO). |
|
SNNAP: Approximate Computing on Programmable SoCs via Neural Acceleration.
IEEE Symp. on High Performance Computer Architecture (HPCA). |
|
2014 |
Compiling Efficient Query Plans for Distributed Shared Memory.
Technical Report UW-CSE-14-10-01, University of Washington. |
Alembic: Automatic Locality Extraction via Migration.
SPLASH-OOPSLA. |
|
Expressing and Verifying Probabilistic Assertions.
Conference on Programming Language Design and Implementation (PLDI). |
|
Nonvolatile Memory is a Broken Time Machine.
ACM SIGPLAN Workshop on Memory Systems Performance and Correctness (MSPC w/ PLDI). |
|
Low-Level Detection of Language-Level Data Races with LARD.
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). |
|
Mercury: An Integrated, 3D-Stacked Server Design for Increasing Physical Density of Key-Value Stores.
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). |
|
Grappa: A Latency-Tolerant Runtime for Large-Scale Irregular Applications.
Technical Report UW-CSE-14-02-01, University of Washington. |
|
2013 |
Approximate Storage in Solid-State Memories.
International Symposium on Microarchitecture (MICRO). |
EnerJ, the Language of Good-Enough Computing.
IEEE Spectrum Feature Article. |
|
Input-Covering Schedules for Multithreaded Programs.
SPLASH-OOPSLA. |
|
Flat Combining Synchronized Global Data Structures.
International Conference on PGAS Programming Models (PGAS). |
|
Compiled Plans for In-Memory Path-Counting Queries.
International Workshop on In-Memory Data Management and Analytics (IMDM w/ VLDB). |
|
DNA-based Molecular Architecture with Spatially Localized Components.
International Symposium on Computer Architecture (ISCA). |
|
Pomace: A Grappa for Non-Volatile Memory.
Non-Volatile Memories Workshop (NVMW). |
|
DDOS: Taming Nondeterminism in Distributed Systems.
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). |
|
Cooperative Empirical Failure Avoidance for Multithreaded Programs.
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). |
|
Input-Covering Schedules for Multithreaded Programs.
Workshop on Determinism and Correctness in Parallel Programming w/ International Conference on Architectural Support for Programming Languages and Operating Systems (WoDet w/ ASPLOS). |
|
2012 |
Neural Acceleration for General-Purpose Approximate Programs.
International Symposium on Microarchitecture (MICRO). Selected for IEEE Micro Top Picks 2012. |
Automatic Discovery of Performance and Energy Pitfalls in HTML and CSS.
International Symposium on Workload Characterization (IISWC). |
|
IFRit: Interference-Free Regions for Dynamic Data-Race Detection.
SPLASH-OOPSLA. |
|
Do we need a crystal ball for task migration?
USENIX Hot Topics in Parallelism (HotPar). |
|
Addressing Dark Silicon Challenges with Disciplined Approximate Computing.
Dark Silicon Workshop w/ International Symposium on Computer Architecture (DaSi w/ ISCA). |
|
Towards Neural Acceleration for General-Purpose Approximate Computing.
Workshop on Energy Efficient Design w/ International Symposium on Computer Architecture (WEED w/ ISCA). |
|
RADISH: Always-On Sound and Complete RAce Detection In Software and Hardware.
International Symposium on Computer Architecture (ISCA). |
|
Automatic Empirical Failure Avoidance for Concurrent Software.
Workshop on Determinism and Correctness in Parallel Programming w/ International Conference on Architectural Support for Programming Languages and Operating Systems (WoDet w/ ASPLOS). |
|
The Case For Merging Execution- and Language-level Determinism with MELD.
Workshop on Determinism and Correctness in Parallel Programming w/ International Conference on Architectural Support for Programming Languages and Operating Systems (WoDet w/ ASPLOS). |
|
Architecture Support for Disciplined Approximate Programming.
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). |
|
2011 |
Accelerating Data Race Detection with Minimal Hardware Support.
EuroPar. |
Data-Race Exceptions Have Benefits Beyond the Memory Model.
Workshop on Memory Systems Performance and Correctness w/ Conference on Programming Language Design and Implementation (MSPC w/ PLDI). |
|
On the Impact of Memory Models on Software Reliability in Multiprocessors.
Symposium on Principles of Distributed Computing (PODC). |
|
Crunching Large Graphs with Commodity Processors.
USENIX Hot Topics in Parallelism (HotPar). |
|
EnerJ: Approximate Data Types for Safe and General Low-Power Computation.
Conference on Programming Language Design and Implementation (PLDI). |
|
Isolating and Understanding Concurrency Errors Using Reconstructed Execution Fragments.
Conference on Programming Language Design and Implementation (PLDI). |
|
Operating System Implications of Fast, Cheap, Non-Volatile Memory.
USENIX Hot Topics in Operating Systems (HotOS). |
|
Dense Approximate Storage in Phase-Change Memory.
Wild and Crazy Ideas w/ International Conference on Architectural Support for Programming Languages and Operating Systems (WACI w/ ASPLOS). |
|
The Deterministic Execution Hammer: How Well Does it Actually Pound Nails?
Workshop on Determinism and Correctness in Parallel Programming w/ International Conference on Architectural Support for Programming Languages and Operating Systems (WoDet w/ ASPLOS). |
|
RCDC: A Relaxed Consistency Deterministic Computer.
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). |
|
Characterizing the Performance and Energy Efficiency of Lock-Free Data Structures.
Workshop on Interaction between Compilers and Computer Architectures w/ International Symposium on High-Performance Computer Architecture (INTERACT w/ HPCA). |
|
Checked Load: Architectural Support for JavaScript Type-Checking on Mobile Processors.
International Symposium on High-Performance Computer Architecture (HPCA). |
|
System Introspection with Hardware Watchmachines.
Fun Ideas and Thoughts w/ Conference on Programming Language Design and Implementation (PLDI FIT). |
|
2010 |
A Limit Study of JavaScript Parallelism.
International Symposium on Workload Characterization (IISWC). |
Deterministic Process Groups in dOS.
Symposium on Operating Systems Design and Implementation (OSDI). |
|
Composable Specifications for Structured Shared-Memory Communication.
SPLASH-OOPSLA. |
|
Conflict Exceptions: Providing Simple Concurrent Language Semantics with Precise Hardware Exceptions for Data Races.
International Symposium on Computer Architecture (ISCA). |
|
ColorSafe: Architectural Support for Debugging and Dynamically Avoiding Multi-variable Atomicity Violations.
International Symposium on Computer Architecture (ISCA). |
|
Lock Prediction.
USENIX Hot Topics in Parallelism (HotPar). |
|
CoreDet: A Compiler and Runtime System for Deterministic Multithreaded Execution.
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). |
|
DMP: Deterministic Shared Memory Multiprocessing.
IEEE Micro Top Picks in Computer Architecture. |
|
Shared-Memory Multiprocessors.
Chapter in Encyclopedia of Parallel Computing, Editor: David Padua. |
|
2009 |
Finding Concurrency Bugs with Context-Aware Communication Graphs.
International Symposium on Microarchitecture (MICRO). |
The Bulk Multicore Architecture for Improved Programmability.
Communications of the ACM. |
|
Concurrency Discovery for Very Large Windows of Execution.
Workshop on Parallel Execution of Sequential Programs on Multi-core Architectures w/ International Symposium on Computer Architecture (PESPMA w/ ISCA). |
|
Two Hardware-based Approaches for Deterministic Multiprocessor Replay.
Research Highlights, Communications of the ACM. |
|
The Case for System Support for Concurrency Exceptions.
USENIX Hot Topics in Parallelism (HotPar). |
|
DMP: Deterministic Shared Memory Multiprocessing.
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). Selected for IEEE Micro Top Picks 2009. |
|
Self-Powered Processors.
Wild and Crazy Ideas w/ International Conference on Architectural Support for Programming Languages and Operating Systems (WACI w/ ASPLOS). |
|
Atom-Aid: Detecting and Surviving Atomicity Violations.
IEEE Micro Top Picks in Computer Architecture. |
|
SoftSig: Software-Exposed Hardware Signatures for Memory Disambiguation.
IEEE Micro Top Picks in Computer Architecture. |
|
Programming and Debugging Shared Memory Programs with Data Coloring.
Workshop on Compilers for Parallel Computing (CPC). |
|
Using Checkpoint-Assisted Value Prediction to Hide L2 Misses.
ACM Transactions on Architecture and Code Optimization (TACO). |
|
2008 |
Atom-Aid: Detecting and Surviving Atomicity Violations.
International Symposium on Computer Architecture (ISCA). Selected for IEEE Micro Top Picks 2008. |
DeLorean: Recording and Deterministically Replaying Shared-Memory Multiprocessor Execution Efficiently.
International Symposium on Computer Architecture (ISCA). |
|
Explicitly Parallel Programming with Shared-Memory is Insane: At Least Make it Deterministic!
Workshop on Software and Hardware Challenges of Manycore Platforms w/ International Symposium on Computer Architecture (SHCMP w/ ISCA). |
|
SoftSig: Software-Exposed Hardware Signatures for Memory Disambiguation.
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). Selected for IEEE Micro Top Picks 2008. |
|
Concurrency Control with Data Coloring.
Workshop on Memory Systems Performance and Correctness w/ International Conference on Architectural Support for Programming Languages and Operating Systems (MSPC w/ ASPLOS). |
|
2007 |
BulkSC: Bulk Enforcement of Sequential Consistency.
International Symposium on Computer Architecture (ISCA). |
Implicit Parallelism with Ordered Transactions.
Principles and Practice of Parallel Programming (PPoPP). |
|
Colorama: Architectural Support for Data-Centric Synchronization.
International Symposium on High-Performance Computer Architecture (HPCA). |
|
2006 |
Scalable Cache Miss Handling for High Memory Level Parallelism.
International Symposium on Microarchitecture (MICRO). |
Bulk Disambiguation of Speculative Threads in Multiprocessors.
International Symposium on Computer Architecture (ISCA). |
|
POSH: A TLS Compiler that Exploits Program Structure.
Principles and Practice of Parallel Programming (PPoPP). |
|
Are We Ready for High Memory-Level Parallelism?
Workshop on Memory Performance Issues w/ International Symposium on High-Performance Computer Architecture (WMPI w/ HPCA). Also appears in SIGMICRO Newsletter selection from WMPI-2006. |
|
Energy-Efficient Thread-Level Speculation on a CMP.
IEEE Micro Top Picks in Computer Architecture. |
|
2005 |
Thread-Level Speculation on a CMP Can Be Energy Efficient.
International Conference on Supercomputing (ICS). Selected for IEEE Micro Top Picks 2005. |
Tasking with Out-of-Order Spawn in TLS Chip Multiprocessors: Microarchitecture and Compilation.
International Conference on Supercomputing (ICS). Selected for IEEE Micro Top Picks 2005. |
|
2004 |
CAVA: Hiding L2 Misses with Checkpoint-Assisted Value Prediction.
IEEE Computer Architecture Letters (CAL). |
2003 |
An Overview Of The Blue Gene/L System Software Organization.
Parallel Processing Letters. |
An Overview Of The Blue Gene/L System Software Organization.
International Conference on Parallel and Distributed Computing (Euro-Par). |
|
Full Circle: Simulating Linux Clusters on Linux Clusters.
LCI International Conference on Linux Clusters (CWCE). Selected as one of the top 3 papers in the conference. |
|
2002 |
Blue Gene/L, a system-on-a-chip.
IEEE International Conference on Cluster Computing (CC). |
An Overview of the Blue Gene/L Supercomputer.
IEEE Supercomputing (SC). |
|
Evaluation of a Multithreaded Architecture for Cellular Computing.
International Symposium on High-Performance Computer Architecture (HPCA). |
|
Cellular Supercomputing with System-on-a-Chip.
International Solid State Circuits Conference (ISSCC). |
|
2000 |
An environment for easy cross synchronization of multimedia Web based material.
Frontiers in Education. |