3. The Consequence of Specialization in Data Systems
Data consistency is critical
Data flow is essential
4. Two Ways to Capture Changes
Extract changes from the database commit log
– Tough but possible
– Consistent
Application code dual-writes to the database and a pub-sub system
– Easy on the surface
– Consistent?
5. Change Extract: Databus
[Diagram: updates flow into the Primary Data Store; Databus carries the resulting data change events, with standardization, to downstream systems such as the Search Index, Graph Index, and Read Replicas]
6. Example: External Indexes
Description
– Full-text and faceted search
over profile data
Requirements
– Timeline consistency
– Guaranteed delivery
– Low latency
– User-space visibility
[Diagram: members update their skills on linkedin.com; change events flow through Databus into the People Search Index, which serves search results to recruiters on recruiter.linkedin.com]
7. A brief history of Databus
2006-2010: Databus became an established and vital piece of infrastructure for consistent data flow from Oracle
2011: Databus (V2) addressed scalability and operability issues
2012: Databus supported change capture from Espresso
2013: Databus open-sourced
– https://github.com/linkedin/databus
8. Databus Eco-system: Participants
[Diagram: the Primary Data Store feeds Change Data Capture, which publishes change events into the Change Event Stream, consumed by Consumer Applications]
Source
• Supports transactions
Change Data Capture
• Extracts the changed data of committed transactions
• Transforms it to ‘user-space’ events
• Preserves atomicity
Consumer Application
• Receives change events quickly
• Preserves consistency with the source
9. Databus Eco-System : Realities
[Diagram: the source feeds Change Data Capture, which publishes into the Change Event Stream; a fast consumer wants changes since the last 5 seconds, a slow consumer wants changes since last week, and a new consumer wants every change; meanwhile, schemas evolve]
• The source cannot be burdened by ‘long look back’ extracts
• Applications cannot be forced to move to the latest version of a schema all at once
10. Key Design Decisions : Semantics
Change Data Capture uses logical clocks attached to the
source (SCN)
– Change data stream is ordered by SCN
– Simplifies data portability: the change stream is f(SourceState, SCN)
Applications are idempotent
– At least once delivery
– Track progress reliably (SCN)
– Timeline consistency
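The two decisions above (at-least-once delivery plus idempotent consumers tracking an SCN) can be sketched as follows. This is a minimal illustration, not the Databus client API; all class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Sketch: an idempotent consumer that tracks progress with the source's
// logical clock (SCN). Under at-least-once delivery, re-delivered events
// are skipped, so applying the stream twice leaves the same end state.
class IdempotentConsumer {
    private final Map<String, String> derivedStore = new HashMap<>();
    private final AtomicLong lastAppliedScn = new AtomicLong(-1);

    // Apply a change event only if its SCN is beyond our checkpoint.
    // (A real system would checkpoint per consistency window, not per event.)
    void onDataEvent(long scn, String key, String value) {
        if (scn <= lastAppliedScn.get()) {
            return; // duplicate delivery: already applied, safely ignore
        }
        derivedStore.put(key, value);
        lastAppliedScn.set(scn); // durable persistence of the SCN omitted here
    }

    long checkpoint() { return lastAppliedScn.get(); }
    String get(String key) { return derivedStore.get(key); }
}
```

On restart, the consumer resumes the stream from `checkpoint()`, which is why timeline consistency and reliable progress tracking come for free once events are ordered by SCN.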
11. Key Design Decisions : Systems
Isolate fast consumers from slow consumers
– Workload separation between online (recent), catch-up (old), and bootstrap (all)
Isolate sources from consumers
– Schema changes
– Physical layout changes
– Speed mismatch
Schema-awareness
– Compatibility checks
– Filtering at change stream
12. The Components of Databus
[Diagram: change data flows from the DB through Change Capture into the Relay’s in-memory Event Buffer; the Databus Client library delivers online changes to consumer applications; a Bootstrap Consumer feeds a Bootstrap service whose Log Store and Snapshot Store serve older changes to slow applications and consistent snapshots to new applications; metadata is shared across components]
13. Change Data Capture
Contains logic to extract changes from the source, starting from a specified SCN
Implementations
– Oracle
Trigger-based
Commit ordering
Special instrumentation required
– MySQL
Custom-storage-engine based
EventProducer
start(SCN) //capture changes from specified SCN
SCN getSCN() //return latest SCN
[Diagram: Change Data Capture pulls from the database, guided by the SCN and the database schemas]
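The EventProducer contract above can be illustrated with a toy in-memory change log: `start(scn)` replays committed changes from a given SCN, and `getSCN()` reports the latest SCN seen. This is only a sketch; a real implementation tails Oracle trigger tables or a MySQL storage engine.

```java
import java.util.ArrayList;
import java.util.List;

// Toy change log illustrating the EventProducer contract from the slide.
class ToyEventProducer {
    static class Change {
        final long scn; final String row;
        Change(long scn, String row) { this.scn = scn; this.row = row; }
    }

    private final List<Change> commitLog = new ArrayList<>();
    private long latestScn = -1;

    // Simulate a committed change arriving in commit order.
    void commit(long scn, String row) {
        commitLog.add(new Change(scn, row));
        latestScn = scn;
    }

    // Capture changes at or after the specified SCN, in commit order.
    List<String> start(long fromScn) {
        List<String> out = new ArrayList<>();
        for (Change c : commitLog)
            if (c.scn >= fromScn) out.add(c.scn + ":" + c.row);
        return out;
    }

    long getSCN() { return latestScn; }
}
```

Because the replay is a pure function of (source state, SCN), a relay can restart from any checkpoint without extra coordination.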
14. MySQL : Change Data Capture
[Diagram: a MySQL master replicates to a custom MySQL slave; the slave pushes events to the Relay over a TCP channel]
• MySQL replication takes care of
• bin-log parsing
• The protocol between master and slave
• Handling restarts
• The Relay
• Provides a TCP protocol interface for pushed events
• Controls and manages the MySQL slave
15. Publish – Subscribe API
[Diagram: Change Data Capture extracts (src, SCN) from the DB and publishes into the in-memory Event Buffer; consumers subscribe (src, SCN) to the stream]
EventBuffer
startEvents() //e.g. new txn
appendEvent(DbusEvent, ...) //DbusEvent(enc(schema,changeData), src, pk)
endEvents(SCN) //e.g. end of txn; commit
rollbackEvents() //abort this window
Consumer
register(source, ‘Callback’)
onStartConsumption() //once
onStartDataEventSequence(SCN)
onStartSource(src,Schema)
onDataEvent(DbusEvent e,…)
onEndSource(src,Schema)
onEndDataEventSequence(SCN)
onRollback(SCN)
onStopConsumption() //once
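The window semantics of this API can be sketched with a toy buffer: events appended after `startEvents()` become visible to subscribers only once `endEvents(SCN)` commits the window, and `rollbackEvents()` discards them. Method names mirror the slide, but the implementation is purely illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the transaction-window (consistency-window) semantics of the
// publish API: consumers see events one committed window at a time.
class ToyEventBuffer {
    private final List<String> committed = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();
    private long lastWindowScn = -1;

    void startEvents() { pending.clear(); }                 // e.g. new txn
    void appendEvent(String event) { pending.add(event); }  // buffered, invisible
    void endEvents(long scn) {                              // end of txn; commit
        committed.addAll(pending);
        pending.clear();
        lastWindowScn = scn;
    }
    void rollbackEvents() { pending.clear(); }              // abort this window

    // Subscribers see only whole, committed consistency windows.
    List<String> visibleEvents() { return new ArrayList<>(committed); }
    long lastScn() { return lastWindowScn; }
}
```

This is why a consumer's `onDataEvent` callbacks are always bracketed by `onStartDataEventSequence(SCN)` and `onEndDataEventSequence(SCN)`: atomicity of the source transaction is preserved end to end.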
16. The Databus Change Event Stream
[Diagram: the change event stream comprises the Relay’s in-memory Event Buffer for online changes and the Bootstrap service’s Log Store and Snapshot Store]
• Provide APIs to obtain change events
• Query API specifies a logical clock (SCN) and a source
• ‘Get change events greater than SCN’
• Filtering at source possible
• MOD, RANGE filter functions
applied to primary key of the event
• Batching/Chunking to guarantee
progress
• Does not contain state of consumers
• Contains references to metadata and
schemas
• Implementation
• HTTP server
• Persistent connection to clients
• REST API
Change Event Stream
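The "events greater than SCN" query with a server-side MOD filter over the primary key can be sketched as below. This is an illustrative in-memory model, not the actual Databus REST API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Sketch of server-side filtering at the change stream: a MOD filter over
// the event's primary key lets a consumer fetch only its partition,
// cutting network traffic between relay and client.
class ModFilterRelay {
    static class Event {
        final long scn; final long pk;
        Event(long scn, long pk) { this.scn = scn; this.pk = pk; }
    }

    private final List<Event> buffer = new ArrayList<>();

    void append(long scn, long pk) { buffer.add(new Event(scn, pk)); }

    // "Get change events greater than SCN", restricted to pk MOD n == i.
    List<Long> pull(long sinceScn, int n, int i) {
        return buffer.stream()
                .filter(e -> e.scn > sinceScn && Math.floorMod(e.pk, n) == i)
                .map(e -> e.pk)
                .collect(Collectors.toList());
    }
}
```

A RANGE filter works the same way, with the predicate testing whether the key falls in the consumer's assigned key range.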
17. Meta-data Management
Event definition, serialization and transport
– Avro
Oracle, MySQL
– Table schema generates Avro definition
Schema evolution
– Only backwards-compatible changes allowed
Isolation of applications from changes in the source schema
Many versions of a source are used by applications, but only one version (the latest) of the change stream exists
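The "only backwards-compatible changes allowed" rule can be illustrated with a toy check: a new schema version may only add fields (with defaults) and must keep the existing fields' types, so old consumers can still read the latest stream. Databus actually relies on Avro's schema-resolution rules; this simplified stand-in is only a sketch.

```java
import java.util.Map;
import java.util.Set;

// Toy backward-compatibility check in the spirit of Avro's rules:
// schemas are modeled as field-name -> type maps, and newly added
// fields must carry defaults so old readers can resolve them.
class SchemaCompat {
    static boolean backwardCompatible(Map<String, String> oldSchema,
                                      Map<String, String> newSchema,
                                      Set<String> newFieldsWithDefaults) {
        // Every old field must survive with the same type.
        for (Map.Entry<String, String> f : oldSchema.entrySet()) {
            String newType = newSchema.get(f.getKey());
            if (newType == null || !newType.equals(f.getValue()))
                return false; // field removed or retyped: breaking change
        }
        // Any added field must have a default value.
        for (String f : newSchema.keySet()) {
            if (!oldSchema.containsKey(f) && !newFieldsWithDefaults.contains(f))
                return false;
        }
        return true;
    }
}
```

With a check like this run at registration time, schema evolution at the source never forces all applications to upgrade at once.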
18. The Databus Relay
[Diagram: the Relay encapsulates Change Capture, the in-memory Event Buffer, and per-source metadata, backed by the database schemas and a local SCN store]
• Encapsulates change capture logic and
change event stream
• Source aware, schema aware
• Multi-tenant: Multiple Event Buffers
representing change events of different
databases
• Optimizations
• Index on SCN exists to quickly
locate physical offset in EventBuffer
• Locally stores SCN per source for
efficient restarts
• Large Event Buffers possible (> 2G)
19. Scaling Databus Relay
Option 1: peer relays, independent
• Each relay connects directly to the DB
• Increased load on the source DB with each additional relay instance
Option 2: relays in a leader-follower cluster
• Only the leader reads from the DB; followers read from the leader
• Leadership assigned dynamically
• Small period of stream unavailability during leadership transfer
20. The Bootstrap Service
Bridges the continuum between stream and
batch systems
Catch-all for slow / new consumers
Isolate source instance from large scans
Snapshot store has to be seeded once
Optimizations
– Periodic merge
– Filtering pushed down to store
– Catch-up versus full bootstrap
Guaranteed progress for consumers via
chunking
Multi-tenant - can contain data from many
different databases
Implementations
– Database (MySQL)
– Raw Files
[Diagram: a Bootstrap Consumer receives online changes from the Relay and writes them to the Bootstrap service’s Log Store and Snapshot Store; the Snapshot Store is seeded once from the database]
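The "periodic merge" optimization above can be sketched as follows: the log store keeps raw change events, and a merge folds them into the snapshot store by primary key, so a bootstrapping consumer reads one consistent row per key instead of replaying every change. Illustrative only; the real stores are MySQL tables or raw files.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the bootstrap service's two stores and the periodic merge
// that compacts the log store into the snapshot store keyed by pk.
class ToyBootstrap {
    static class Change {
        final long scn; final long pk; final String row;
        Change(long scn, long pk, String row) { this.scn = scn; this.pk = pk; this.row = row; }
    }

    private final List<Change> logStore = new ArrayList<>();     // every change event
    private final Map<Long, String> snapshotStore = new HashMap<>(); // latest row per pk
    private long snapshotScn = -1;

    void appendToLog(long scn, long pk, String row) { logStore.add(new Change(scn, pk, row)); }

    // Periodic merge: apply log entries newer than the snapshot's SCN
    // (log entries are assumed to arrive in SCN order).
    void merge() {
        long maxScn = snapshotScn;
        for (Change c : logStore) {
            if (c.scn > snapshotScn) {
                snapshotStore.put(c.pk, c.row);
                maxScn = Math.max(maxScn, c.scn);
            }
        }
        snapshotScn = maxScn;
    }

    int snapshotSize() { return snapshotStore.size(); }
    String latest(long pk) { return snapshotStore.get(pk); }
}
```

This also shows the catch-up-vs-full-bootstrap trade-off: a consumer slightly behind replays the log, while a brand-new consumer reads the compacted snapshot.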
21. The Databus Client Library
Glue between Databus Change
Stream and business logic in the
Consumer
Switches between relay and bootstrap
as needed
Optimizations
– Change events use a batch write API, without deserialization
Periodically persists SCN for lossless
recovery
Built-in support for parallelism
– Consumers need to be thread-safe
– Useful for scaling large batch processing
(bootstrap)
[Diagram: within the Databus Client Library, a Change Stream Client reads from the Databus change stream and writes into a local EventBuffer; a Dispatcher iterates over the buffer and issues callbacks to the registered stream and bootstrap consumers; an SCN store tracks progress]
23. Scaling Applications - I
[Diagram: the change stream is partitioned as i = pk MOD N; one client application processes partitions 0..k-1, another processes partitions k..N-1]
• Databus clients consume partitioned streams
• Partitioning strategy: range or hash
• Partitioning function applied at the source
• Number of partitions (N) and list of partitions (i) specified statically in configuration
• Not easy to add/remove nodes
• Needs a configuration change on all nodes
• Client nodes are uniform: any node can process any partition(s)
• Clients distribute the processing load
24. Scaling Applications - II
[Diagram: the Databus stream is partitioned as i = pk mod N; the N partitions are distributed evenly and dynamically amongst m client nodes, each handling N/m partitions; SCNs are written to a central location]
• Databus clients consume partitioned streams
• Partitioning strategy: MOD
• Partition function applied at the source
• Number of partitions (N) and cluster name specified statically in configuration
• Easy to add or remove nodes
• Dynamic redistribution of partitions
• Fault tolerance for client nodes
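The even distribution of N partitions over m nodes can be sketched as a simple assignment function: node j owns partitions {i : i mod m == j}, so adding or removing a node only requires recomputing this map (in the real deployment, Helix performs the dynamic reassignment). This helper is hypothetical, not the Databus client API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of distributing N MOD-partitions evenly amongst m client nodes.
class PartitionAssigner {
    // Returns node index -> list of owned partition indices.
    static Map<Integer, List<Integer>> assign(int numPartitions, int numNodes) {
        Map<Integer, List<Integer>> byNode = new HashMap<>();
        for (int j = 0; j < numNodes; j++) byNode.put(j, new ArrayList<>());
        // Round-robin: partition i goes to node (i mod m).
        for (int i = 0; i < numPartitions; i++)
            byNode.get(i % numNodes).add(i);
        return byNode;
    }
}
```

When a node joins or leaves, recomputing `assign` with the new m yields the redistributed ownership; each node then resumes its new partitions from the centrally stored SCNs.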
25. Databus: Current Implementation
Runs on Linux; written in Java, requires Java 6
All components have HTTP interfaces
Databus Client: Java
– Other language bindings possible
– All communication with the change stream is over HTTP
Libraries
– Netty, for HTTP clients and servers
– Avro, for serialization of change events
– Helix, for cluster awareness
28. Databus Performance : Relay
Relay
– Saturates the network with low CPU utilization
CPU utilization increases with more clients
An increased poll interval (higher consumer latency) reduces CPU utilization
– Scales to hundreds of consumers (client instances)
30. Databus Performance : Consumer
Consumer
– Latency primarily governed by the ‘poll interval’
– Low overhead of the library in event fetch
Spikes in latency are due to network saturation at the relay
Scaling the number of consumers
Use partitioned consumption (filtering at the relay)
– Reduces network utilization, but some increase in latency due to filtering
Increase the ‘poll interval’, tolerate higher latencies
33. Databus Bootstrap :Performance
Bootstrap
– Should we serve from the ‘catch-up store’ or the ‘snapshot store’?
– It depends on traffic patterns in the spectrum from ‘all updates’ to ‘all inserts’
– Tune the service depending on the fraction of updates and inserts
Favour snapshot-based serving for update-heavy traffic
35. Databus at LinkedIn
[Diagram: Oracle and Espresso change event streams are exposed through the managed Databus service]
• The Databus change stream is a managed service
• Applications discover/look up the coordinates of sources
• Multi-tenant, chained relays
• Many sources can be bootstrapped from SCN 0 (the beginning of time)
• Automated change stream provisioning is a work in progress
36. Databus at LinkedIn : Monitoring
Available out of the box as JMX Mbean
Metrics for health
– Lag between the update time at the DB and the time at which it was received by the application
– Time of last contact with the change event stream and source
Metrics for capacity planning
– Event rate / size
– Request rate
– Threads / connections
37. Databus at LinkedIn: The Good
Source isolation: Bootstrap benefits
– Typically, data extracted from sources just once (seeding)
– Bootstrap service used during launch of new applications
– Primary data store not subject to unpredictable high loads due to
lagging applications
Common Data Format
– Avro offers ease of use, flexibility, and performance improvements
(larger retention periods of change events in the Relay)
Partitioned Stream Consumption
– Applications horizontally scaled to hundreds of instances
38. Databus at LinkedIn: Operational Niggles
Oracle Change Capture Performance Bottlenecks
– Complex joins
– BLOBs and CLOBs
– High update rate driven contention on trigger table
Bootstrap: Snapshot store seeding
– Consistent snapshot extraction from large sources
Semi-automated change stream provisioning
39. Quick Review
Specialization in Data Systems
– The CDC pipeline is a first-class infrastructure citizen, up there with
stores and indexes
Source Independent
– Change capture logic can be plugged in
Use of SCN – an external clock attached to source
– Makes change stream more ‘portable’
– Easy for applications to reason about consistency with source
The Pub-Sub API supports the atomicity semantics of transactions
Bootstrap Service
– Isolates the source from abusive scans
– Serves both streaming and batch use-cases
43. Databus: First attempt (2007)
Issues
Source database pressure
caused by slow consumers
Brittle serialization
Editor's Notes
Large-scale storage systems are, by nature, distributed: data is stored on multiple machines. It is also often the case that the same content is stored in different places to support different access patterns, efficient retrieval, or quick look-up of derived data (as opposed to computing it during a look-up). In order to have such distributed systems work together to provide a service, two things are needed. Data flow between these systems: when data is updated on one system, it should be reflected in the other parts that store the same content. Data consistency: the different parts of the system must converge at some point in time. So, we need a change capture system that supports such a distributed system.
Basically, there are two ways changes can be captured: applications can dual-write data into the database and the change stream, or we can capture the data change list from the commit logs that most databases have. Dual writes appear really easy on the surface, but when we start considering transient failure scenarios, achieving consistency gets harder, sometimes impossible; you may need to get into two-phase commits, etc., compromising on performance or availability. On the other hand, extracting changes from the database is almost like post-processing: minimal or no performance penalty, the application is unaware of the existence of a change capture system, and consistency is not an issue as long as you see the commit logs in their entirety. No such thing as a free lunch, of course: commit log formats are proprietary, and extracting from them can be tough. We chose this approach of change extraction.
Here are some of the use cases for Databus. In this picture, we have one source of truth for data: the primary database. Changes to the data are observed (or consumed) by consumers, which may then turn around and update derived data-serving systems. Or data may be extracted into Hadoop (for example), to be re-loaded into some derived systems. At LinkedIn, our primary database is Oracle. For example, we have a database that holds member information. When rows in these databases are altered, the change events need to be propagated to a search index. The search index is used to serve queries from recruiters looking for appropriate candidates. Similarly, there are other consumers of different databases, each having their own business logic to build derived data out of the primary database. Databus is used to capture these changes and provide them as change events to consumers. So, what would be the requirements of such a system?
… Let’s look at a brief biography of Databus
… Now for a close look at Databus
‘Change’ in ‘change data’ is used more as a noun than a verb: ‘data that has changed’ rather than ‘alter the data’; this is the industry-standard term. Change capture logic (CDC) extracts changes in a consistent way, preserving consistency and ordering (e.g. ways to extract the order of commits, and the delivery semantics). Publisher and subscriber APIs let the CDC transform the extracted changes and publish those events with the atomicity guarantees of the source. Applications preserve consistency when they apply the changes they receive in a timely manner. And then there were realities…
There are different types of applications, and schemas evolve at the source. But the source cannot be burdened (a typical problem with V1), and applications cannot be forced to move to the latest version (which resulted in a proliferation of different versions of change streams of the same source).
An external clock is attached to the source, with ordering defined by the source: e.g. commit ordering in Oracle yields increasing SCNs; in the MySQL binlog, increasing transactions could be the SCNs. There is no additional source of truth and no additional point of failure, and the event stream can be recreated given an SCN and a source. For applications, the ordering of events is the same as that seen by the source, so eventually the source and the apps will converge. The SCN is used to track progress on the app side, and apps can reason about consistency with the source: the SCN is an external, logical clock, not tied to any particular change stream node. Apps need to be idempotent, as they can see a change more than once. Derived stores can also reason about consistency amongst each other, as they have SCN visibility, a concept that is useful for comparing consistency across applications. Timeline consistency with an at-least-once guarantee: the order of change events is the same as at the source DB, no updates are missed, and all apps listening to the change stream see the same order of change events.
Pull model, as opposed to push, where producers keep track of their consumers' progress and call clients as long as they are available: the pull model assumes the state required to serve a request lies with the consumer. Restartability is easier, as the state can be computed from (source, SCN) on any machine; this is true at both the change event stream and the consumer. Separation of concerns between the use cases of ‘online consumption’ (recent changes) and ‘catch-up/bootstrap’ (where older changes are required), which have different scalability properties. Isolate sources and consumers: sources can move, schemas can change, and, of course, producer and consumption speeds can vastly differ. We are not just transport: we support metadata, such as schemas. We ensure that consumers have a good experience while the change stream also becomes more manageable, ultimately helping provisioning and consumer robustness. This also gives the option of adding more filtering at the change stream.
Point: change capture is within the relay: each relay is self-sufficient, i.e. since eventBufferState = fn(source, SCN), it has the change capture logic to pull in the changes; if change capture were outside, then the change capture logic and fan-out would have to take care of replication or write to a leader-follower relay cluster. The EventBuffer wraps around if it runs out of memory. Point: the client library fetches changes from the Databus stream (which is now the Relay plus the Bootstrap service). Point: workload separation between the cases of recent changes, older changes, and snapshots: we cannot rely solely on all changes fitting into memory. Point: the Bootstrap Consumer is a special application that listens to changes and updates its log store (persistent change events) and snapshot store (a persistent copy of the database, storing change events in user-space). Point: remember, the client library automatically switches between the appropriate service, relay or bootstrap, depending on the SCN requested by the application. Point: metadata is used by the relay stream (schema awareness). Point: DBs are saved from abusive scans by lagging consumers (isolation). Counterpoints: a push model requires additional state about consumers, or the speeds of consumption and production have to match, and it is harder to maintain lossless guarantees.
Are all of these open-sourced? Oracle is.
A custom MySQL replication setup with a custom MySQL slave instance: specifically, the custom storage engine of the slave writes to a TCP channel instead of disk. The slave state has an SCN (offset, log number) that can be controlled, with up to 3 days' worth of rewindability (configurable).
The control flow is depicted. Note the pull model: at the CDC end, it is easy to make data portable; (SCN, source) is sufficient to re-create the state in the EventBuffer, making restarts easier, and no state about subscribers needs to be maintained on the upstream system (as in the case of a push model). Publish does not require persistence/durability guarantees; these are obtained from the source of truth and the fact that the change stream is a f(Source, SCN). At one end point, the CDC captures changes from the database and publishes them to an event buffer; at the other end, applications subscribe to the change stream and receive callbacks when change data from the sources they have subscribed to becomes available. Point: the end points have APIs supporting transaction semantics (atomicity). Point: ‘windows’ or consistency windows are points in the stream that are consistent with the source at the specified SCN. Point: consumers see events one consistency window at a time, i.e. events are visible to the consumers after the ‘end of window’ has been written. Question: what if CDC were outside? CDC can be a pull model, but can it push off-box to the event stream? Yes, but then the event stream isn't simple: cluster state (leader-follower) is shared between CDC and Relay, and failure of CDC needs to be treated and monitored separately; packaging CDC with the event stream has operability advantages. Question: is onRollback() triggered at the same time the rollback appears in the buffer? No. This isn't about one-to-one correspondence in time but in semantics: both have a notion of ‘transactions’, apps don't see uncommitted events, and the output of apps has the option of seeing the whole transaction in its entirety as well, which is very important, for example, in relay chaining.
Both the Relay (online changes) and the Bootstrap service (older changes) together constitute the change event stream. They do not share the exact same API, but semantically say the same thing: get events since a point in the logical clock. Both have the ability to perform simple filters on the service side, and both have chunking/progress guarantees. HTTP-based implementation, with efficient communication to clients.
Database schemas are converted to a neutral format for the Databus events; we chose Avro. Tools are available to publish schemas to a ‘schema registry’ and to generate schemas from different source types. Schemas are generated and stored in a place accessible by the change stream, which ensures backward compatibility (relevant for bootstrap). Schemas are available to consumers for deserialization.
The Relay encapsulates the change capture logic, the event buffer (remember the publish API), implemented as a circular buffer, and the metadata. It constitutes the online, most frequently used part of the change stream, addressing 98% of requests on a typical day.
Relays talk to the database directly, since they contain the change capture. This has horizontal scalability limits.
The Bootstrap Consumer is a special application that consumes events from relays and writes to a persistence layer called the ‘log store’. Another process applies changes to the snapshot store, using the pk that was in the publish API (this separate thread is not shown here). Seeding: bootstrapping the bootstrap.
The Databus client library orchestrates consumption of the change stream from the bootstrap/relay. It uses an HTTP fetch to get events from upstream and writes to the event buffer using the efficient readEvents call; currently a polling mechanism is used to get events from upstream. The dispatcher uses the iterator interface of the EventBuffer to read the events and then calls the user-specified consumer implementations. The client library by default persists the SCN for lossless recovery. Consumers need to be thread-safe and can take advantage of parallelism. Let's look at a typical application.
Key: a single instance of the client library can handle multiple consumers subscribing to multiple change streams. Different logic and tuning are required for the bootstrap and online cases; facilities are provided. Schema-aware apps can force type conversion from one schema to another, as long as backward compatibility is preserved amongst the change data. An override of the persisted SCN is possible for cases where flush() is not guaranteed by the application (e.g. an index): apps store the SCN in the index and retrieve it on startup. Applications typically are distributed, so they have some notion of partitions / partition awareness. It can be tempting to consume the entire event stream of an unpartitioned upstream store and then drop (n-1)/n-th of the partitions on the floor, but that is inefficient and expensive (for the relay, and latency-wise for the consumer, as we shall see). Instead…
Here, client nodes refer to one instance of the client library, so a node can be an application instance. Applications themselves are partition-aware: they write to partitioned indexes/stores and need to distribute the processing load. The partition function is applied at the source on the primary key, on the fly; the source itself needn't be partitioned. Partitions can be changed as more nodes are added, if the application accounts for ‘repartitioning’: checkpoints need to be reset and the configuration needs to be changed. But this is hardly operation-friendly…
The clients are partition-aware, but the partition assignment is dynamic. Cluster awareness is introduced via client app clusters, with operability advantages: the ability to add or remove nodes with dynamic redistribution. Helix is used to manage client clusters, and as the SCN store. Now, let's look at some aspects of the current implementation.
… And on to some code – let’s take a look at the application
Points to note: how sources are specified; what a consumer looks like; and how a Databus client uses subscription (register).
Key: show how the payload is extracted. We have visited these APIs earlier. Now to dwell on performance.
Setup: measure relay serving throughput and CPU utilization; vary the number of consumers and the poll interval (tpt_10 means throughput with a poll interval of 10 ms, cpu_55 is CPU utilization with a 55 ms poll interval, etc.); consumers pulling at max speed (no additional processing); event size is 2.5 KB; no write traffic, relay buffer pre-filled. The hypothesis was that we can support more consumers if the poll interval is long, and that is confirmed by the observations: the relay can easily saturate the network with minimal CPU utilization; once the network is saturated, CPU increases with the number of consumers due to networking overhead (context switching); even with 200 consumers, CPU utilization is less than 10%; higher poll intervals generally lead to less CPU utilization.
Setup: measure the read throughput of each consumer with update traffic on the relay; vary the number of consumers and the update rate; consumers pulling at max speed (no additional processing); poll interval is 10 ms; event size is 2.5 KB. Observations: drops mean a consumer is no longer able to keep up; the reason is network saturation on the relay side, e.g. 2000 updates/s * 20 consumers * 2.5 KB = 100 MBps < max network bandwidth < 200 MBps = 2000 updates/s * 40 consumers * 2.5 KB.
Setup: same as above, but measure the time in milliseconds for events to reach the consumer; added partitioning through server-side filtering to see what happens if the network is not a bottleneck. Observations: latency knees due to relay network saturation, as before; latency without SSF (server-side filtering) is around 10-20 ms (including an average 5 ms overhead due to the poll interval); with SSF the network is no longer a bottleneck, and latency goes up to 15-45 ms due to SSF computation overhead. So the relay can scale to hundreds of consumers if they can tolerate a little bit of latency.
E2E latency has no meaning for the bootstrap service, and it can easily saturate the network with multiple clients, so we focused on comparing serving out of the log store vs the snapshot store. Setup: compare serving deltas vs serving all updates, with a synthetic workload, varying the number of updates to existing keys vs new keys (i.e. inserts). Observations: catch-up time is constant, as it does not distinguish updates vs inserts; the break-even point is around 1:1 updates vs inserts. For a small number of inserts, the benefit of the snapshot is overwhelming. The break-even point seems to be when half of the changes are updates; we monitor the update rate in production and tune the bootstrap service.
The Databus stream for Oracle: things that scale with memberId and things that scale with connections (multiplicative, only inserts); small sources such as advertiser data (but consistency is important). Applications: search, with multiple instances in a large distributed deployment, low latency requirements, and consistency. Bootstrap is used in new ways: to automatically provision new index nodes, for new in-memory advertising data sets, and to fix legacy stores. Espresso as a source of truth, 2013 and beyond: a partitioned primary data store (transactional) based on a MySQL storage engine, horizontally scalable, with the change stream partitioned at the source of truth rather than at the change stream. The change stream still requires trigger-based ‘databusification’ in Oracle. Relay provisioning is still manual, in the sense that there is no self-serve mechanism to specify a source and no automatic source discovery; relays are provisioned in the ‘cloud’ depending on capacity estimates. Let's look at some change capture implementations we have.
Overall: is external clock propagation a good idea? Is it necessary or a nice-to-have? It becomes important in the case of bootstrap. Are checkpoints portable? If a mapping exists between SCN and CDC-GEN-UNIQ-NUM, or if an index on SCN exists at every layer (bootstrap and relay), then it can be handled as a system-level implementation and the client needn't use the SCN explicitly. The SCN, an external clock, is a convenient way of storing logical state across instances of the change stream.