Caching Abstraction #211

Merged: 8 commits, Feb 6, 2016
Showing changes from 4 commits
2 changes: 0 additions & 2 deletions docs/package.json
@@ -23,8 +23,6 @@
"nuclear-js": "^1.0.5",
"webpack": "^1.9.11",
"webpack-dev-server": "^1.9.0",
"grunt-concurrent": "^1.0.0",
"grunt-contrib-connect": "^0.10.1",
"remarkable": "^1.6.0",
"front-matter": "^1.0.0",
"glob": "^5.0.10",
2 changes: 2 additions & 0 deletions src/reactor.js
@@ -1,6 +1,7 @@
import Immutable from 'immutable'
import createReactMixin from './create-react-mixin'
import * as fns from './reactor/fns'
import { DefaultCache } from './reactor/cache'
import { isKeyPath } from './key-path'
import { isGetter } from './getter'
import { toJS } from './immutable-helpers'
@@ -27,6 +28,7 @@ class Reactor {
const baseOptions = debug ? DEBUG_OPTIONS : PROD_OPTIONS
const initialReactorState = new ReactorState({
debug: debug,
cache: config.cache || DefaultCache(),
Contributor:

How do you envision using LRU cache? Bundled with nuclear? If so how is it exposed to end user?

Contributor (author):

How do you envision using LRU cache?

As an opt-in configuration change that is passed when Reactor is created.

If so how is it exposed to end user?

I'd probably punt that to @jordangarcia for the best way to expose it.

Contributor:

I'm thinking probably exposing both the DefaultCache and LRUCache on the Nuclear object, and allowing one to be passed into creating a new reactor.
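The opt-in wiring under discussion can be sketched in plain JavaScript (stand-in functions only; the real constructors live in nuclear-js, and the fallback follows the diff's `config.cache || DefaultCache()` line):

```javascript
// Stand-in for the PR's DefaultCache factory (illustrative, not the real one)
function DefaultCache() {
  return { kind: 'basic' }
}

// Mimics the reactor constructor's cache fallback from the diff
function createReactorState(config = {}) {
  return { cache: config.cache || DefaultCache() }
}

const byDefault = createReactorState()
console.log(byDefault.cache.kind) // 'basic'

// An LRU cache would be passed in the same way, as opt-in configuration
const withLru = createReactorState({ cache: { kind: 'lru', limit: 500 } })
console.log(withLru.cache.kind) // 'lru'
```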

// merge config options with the defaults
options: baseOptions.merge(config.options || {}),
})
201 changes: 201 additions & 0 deletions src/reactor/cache.js
@@ -0,0 +1,201 @@
import { List, Map, OrderedSet, Record } from 'immutable'

export const CacheEntry = Record({
value: null,
storeStates: Map(),
dispatchId: null,
})

/*******************************************************************************
* interface PersistentCache {
* has(item)
Contributor:

Why are lookup, has, and hit all necessary? Is a get method insufficient?

Contributor (author):

lookup and has are simply analogous to get and has in an Immutable.Map.

The reason there seem to be extra operations is that nuclear is designed to use a persistent data structure for the cache. For a caching abstraction to encapsulate something like LRU, the cache has to record when an item is read from it. Since lookup returns the actual cached value and we can't perform side effects, a separate operation is required to update the cache to track recency.

I meant to add a comment in the code, but this interface is modeled directly after the caching protocol in clojure.core.cache, which is also designed for immutable data caches.

Contributor:

Got it. I figured the reason was to keep the cache construct immutable. IMO it would be simpler for developers to allow mutability and go with get + put.

Contributor (author):

My initial approach had this, but changing from an immutable to a mutable cache had negative implications (e.g. when the reactor evaluates using previous state, causing cache churn).

Contributor:

I see, makes sense
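The persistent-cache protocol discussed above can be sketched in plain JavaScript, with frozen objects standing in for Immutable.Map (the class and its internals are illustrative only, not the PR's code):

```javascript
// Every "write" (miss, evict) returns a NEW cache; hit returns a cache
// value too, so an LRU variant can record recency without side effects.
class TinyCache {
  constructor(entries = {}) {
    this.entries = Object.freeze({ ...entries })
  }
  has(item) { return item in this.entries }
  lookup(item, notFoundValue) {
    return this.has(item) ? this.entries[item] : notFoundValue
  }
  // A plain cache has nothing to record on a read, so hit is the identity
  hit(item) { return this }
  miss(item, entry) {
    return new TinyCache({ ...this.entries, [item]: entry })
  }
  evict(item) {
    const { [item]: _, ...rest } = this.entries
    return new TinyCache(rest)
  }
}

const c0 = new TinyCache()
const c1 = c0.miss('a', 1)
console.log(c0.has('a')) // false: the original cache value is untouched
console.log(c1.lookup('a')) // 1
```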

* lookup(item, notFoundValue)
* hit(item)
* miss(item, entry)
* evict(item)
* asMap()
* }
*******************************************************************************/

/**
* Plain map-based cache
*/
export class BasicCache {

/**
* @param {Immutable.Map} cache
*/
constructor(cache = Map()) {
this.cache = cache;
}

/**
* Retrieve the associated value, if it exists in this cache, otherwise
* returns notFoundValue (or undefined if not provided)
* @param {Object} item
* @param {Object?} notFoundValue
* @return {CacheEntry?}
*/
lookup(item, notFoundValue) {
return this.cache.get(item, notFoundValue)
}

/**
* Checks if this cache contains an associated value
* @param {Object} item
* @return {boolean}
*/
has(item) {
return this.cache.has(item)
}

/**
* Return cached items as map
* @return {Immutable.Map}
*/
asMap() {
return this.cache
}

/**
* Updates this cache when it is determined to contain the associated value
* @param {Object} item
* @return {BasicCache}
*/
hit(item) {
return this;
}

/**
* Updates this cache when it is determined to **not** contain the associated value
* @param {Object} item
* @param {CacheEntry} entry
* @return {BasicCache}
*/
miss(item, entry) {
Contributor:

I'd have expected this to be called put

Contributor (author):

Agreed it's awkward/jarring at first, but hopefully the reply above clears up where "hit" and "miss" come from.

Contributor:

Yeah, the API is a bit strange, but it's a proven API in a functional context (Clojure), which I really like.

return new BasicCache(
this.cache.update(item, existingEntry => {
if (existingEntry && existingEntry.dispatchId > entry.dispatchId) {
throw new Error("Refusing to cache older value")
}
return entry
})
)
}

/**
* Removes entry from cache
* @param {Object} item
* @return {BasicCache}
*/
evict(item) {
return new BasicCache(this.cache.remove(item))
}
}

/**
* Implements caching strategy that evicts least-recently-used items in cache
* when an item is being added to a cache that has reached a configured size
* limit.
*/
export class LRUCache {

constructor(limit = 1000, cache = new BasicCache(), lru = OrderedSet()) {
Contributor:

would like to see a constant rather than inlined 1000

Contributor (author):

Hmm, this is a simple default value. Do you have a use case in mind where it would be useful to reference this value somewhere else?
I'm somewhat reluctant to provide a default value here at all. I think it's reasonable to ask the library user to specify what's best for their needs.
More than happy to define and export a const if you think it's useful.

this.limit = limit;
this.cache = cache;
this.lru = lru;
}

/**
* Retrieve the associated value, if it exists in this cache, otherwise
* returns notFoundValue (or undefined if not provided)
* @param {Object} item
* @param {Object?} notFoundValue
* @return {CacheEntry}
*/
lookup(item, notFoundValue) {
return this.cache.lookup(item, notFoundValue)
}

/**
* Checks if this cache contains an associated value
* @param {Object} item
* @return {boolean}
*/
has(item) {
return this.cache.has(item)
}

/**
* Return cached items as map
* @return {Immutable.Map}
*/
asMap() {
return this.cache.asMap()
}

/**
* Updates this cache when it is determined to contain the associated value
* @param {Object} item
* @return {LRUCache}
*/
hit(item) {
// if item exists, remove it first to reorder in lru OrderedSet
const lru = this.cache.lookup(item) ?
this.lru.remove(item).add(item) :
this.lru;

return new LRUCache(this.limit, this.cache, lru)
Contributor:

This feels weird to me. Why do we need to return an entirely new cache? Looks like we are storing a pointer to the LRU cache object, so seems like we could point this.cache and this.lru to new Immutable objects rather than creating an entirely new LRU object?

Contributor (author):

Why do we need to return an entirely new cache?

In the general sense, we need to return a new cache because nuclear's cache is currently immutable. There are cases in the code where we run evaluate() on previous values of ReactorState; if the cache were mutable, a cached value might be from the future and you would have to detect that.

That general point aside, you're right that I don't always need a new cache here. If the item isn't actually in the cache and you call hit (which arguably could throw an error instead), I can just return this. Will fix.
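That fix can be sketched with an array standing in for Immutable.OrderedSet (illustrative only, not the PR's code):

```javascript
// On a hit, move the item to the most-recently-used end of the order;
// if the item is absent, return the same value instead of a new one.
function lruHit(lru, item) {
  if (!lru.includes(item)) {
    return lru // nothing to reorder; reusing the value avoids churn
  }
  return [...lru.filter(x => x !== item), item]
}

console.log(lruHit(['a', 'b', 'c'], 'a')) // 'a' moves to the end
console.log(lruHit(['a', 'b'], 'z')) // unchanged: same array back
```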

}

/**
* Updates this cache when it is determined to **not** contain the associated value
* If cache has reached size limit, the LRU item is evicted.
* @param {Object} item
* @param {CacheEntry} entry
* @return {LRUCache}
*/
miss(item, entry) {
if (this.lru.size >= this.limit) {
// TODO add options to clear multiple items at once
const evictItem = this.has(item) ? item : this.lru.first()
Contributor:

won't be very performant if at the limit

Contributor (author):

Yeah, there's a TODO on the line above to do something similar to your LRU that clears out a configured percentage of the cache.
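The eviction step being discussed can be sketched with arrays and objects as stand-ins for the PR's Immutable structures (illustrative only; the real code also special-cases re-missing an item that is already cached):

```javascript
// On a miss at the size limit, evict the least-recently-used item
// (the front of the order list) before recording the new entry.
function lruMiss(state, item, entry) {
  const { limit, entries, lru } = state
  if (lru.length >= limit) {
    const evicted = lru[0]
    const { [evicted]: _, ...kept } = entries
    return {
      limit,
      entries: { ...kept, [item]: entry },
      lru: [...lru.slice(1), item],
    }
  }
  return { limit, entries: { ...entries, [item]: entry }, lru: [...lru, item] }
}

let s = { limit: 2, entries: {}, lru: [] }
s = lruMiss(s, 'a', 1)
s = lruMiss(s, 'b', 2)
s = lruMiss(s, 'c', 3) // limit reached: 'a' is the LRU item and is evicted
console.log(s.lru) // 'a' gone, 'c' is most recent
console.log('a' in s.entries) // false
```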


return new LRUCache(
this.limit,
this.cache.evict(evictItem).miss(item, entry),
this.lru.remove(evictItem).add(item)
)
} else {
return new LRUCache(
this.limit,
this.cache.miss(item, entry),
this.lru.add(item)
)
}
}

/**
* Removes entry from cache
* @param {Object} item
* @return {LRUCache}
*/
evict(item) {
if (!this.cache.has(item)) {
return this;
}

return new LRUCache(
this.limit,
this.cache.evict(item),
this.lru.remove(item)
)
}
}

/**
* Returns default cache strategy
* @return {BasicCache}
*/
export function DefaultCache() {
return new BasicCache()
}
94 changes: 30 additions & 64 deletions src/reactor/fns.js
@@ -1,5 +1,6 @@
import Immutable from 'immutable'
import logging from '../logging'
import { CacheEntry } from './cache'
import { isImmutableValue } from '../immutable-helpers'
import { toImmutable } from '../immutable-helpers'
import { fromKeyPath, getStoreDeps, getComputeFn, getDeps, isGetter } from '../getter'
@@ -330,22 +331,21 @@ export function evaluate(reactorState, keyPathOrGetter) {
}

// Must be a Getter
// if the value is cached for this dispatch cycle, return the cached value
if (isCached(reactorState, keyPathOrGetter)) {
// Cache hit
return evaluateResult(
getCachedValue(reactorState, keyPathOrGetter),
reactorState
)
}

// evaluate dependencies
const args = getDeps(keyPathOrGetter).map(dep => evaluate(reactorState, dep).result)
const evaluatedValue = getComputeFn(keyPathOrGetter).apply(null, args)
const cache = reactorState.get('cache')
var cacheEntry = cache.lookup(keyPathOrGetter)
Contributor:

do you think cacheLookup here should also handle the resolution of storeStates versus externalizing it?

Contributor:

If the cache gets a hit on a getter, then subsequent evaluates will always think the value is a hit.

Maybe I am missing something here, but it seems like this is wrong in a lot of cases.

// store state 1
var result1 = reactor.evaluate(getter)
reactor.dispatch('action')
// store state 2
var results2 = reactor.evaluate(getter)
// it would seem that since the reactor cache has an entry for `getter`, it will always be a hit

Is there a test case to verify this?

Contributor (author):

If the cache is a hit on a getter then subsequent evalutes will always think the value is a hit.

Ah, good catch. This should be

const isCacheMiss = !cacheEntry || isDirtyCacheEntry(reactorState, cacheEntry)
if (isCacheMiss) {
  // ...
}
// ...

I'm surprised this isn't covered by a test case already. Will add one.
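The scenario the missing test should cover can be sketched in plain JavaScript (objects standing in for the Immutable storeStates maps; illustrative only, not the real suite): a cache entry must be treated as stale once any store it depends on has advanced past the state id recorded in the entry.

```javascript
// Stand-in for the dirty check: compare the reactor's current
// storeId => stateId map against the ids recorded in the cache entry.
function isDirtyCacheEntry(currentStoreStates, entryStoreStates) {
  const storeIds = Object.keys(entryStoreStates)
  if (storeIds.length === 0) {
    return true // no recorded store states means it was never really cached
  }
  return storeIds.some(id => currentStoreStates[id] !== entryStoreStates[id])
}

// getter cached against store s1 at state id 1
const entry = { s1: 1 }
console.log(isDirtyCacheEntry({ s1: 1 }, entry)) // false: still a valid hit
// a dispatch bumps s1's state id; the old entry must now miss
console.log(isDirtyCacheEntry({ s1: 2 }, entry)) // true: re-evaluate
```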

Contributor (author):

do you think cacheLookup here should also handle the resolution of storeStates versus externalizing it?

The caches are generalized right now and don't have nuclear specific logic. I felt like this was still the right place for that logic

const isCacheMiss = !cacheEntry
if (isCacheMiss || isDirtyCacheEntry(reactorState, cacheEntry)) {
cacheEntry = createCacheEntry(reactorState, keyPathOrGetter)
}

return evaluateResult(
evaluatedValue,
cacheValue(reactorState, keyPathOrGetter, evaluatedValue)
cacheEntry.get('value'),
reactorState.update('cache', cache => {
return isCacheMiss ?
cache.miss(keyPathOrGetter, cacheEntry) :
cache.hit(keyPathOrGetter)
})
)
}

@@ -375,57 +375,31 @@ export function resetDirtyStores(reactorState) {
return reactorState.set('dirtyStores', Immutable.Set())
}

/**
* Currently cache keys are always getters by reference
* @param {Getter} getter
* @return {Getter}
*/
function getCacheKey(getter) {
return getter
}

/**
* @param {ReactorState} reactorState
* @param {Getter|KeyPath} keyPathOrGetter
* @return {Immutable.Map}
* @param {CacheEntry} cacheEntry
* @return {boolean}
*/
function getCacheEntry(reactorState, keyPathOrGetter) {
const key = getCacheKey(keyPathOrGetter)
return reactorState.getIn(['cache', key])
}
function isDirtyCacheEntry(reactorState, cacheEntry) {
const storeStates = cacheEntry.get('storeStates')

/**
* @param {ReactorState} reactorState
* @param {Getter} getter
* @return {Boolean}
*/
function isCached(reactorState, keyPathOrGetter) {
const entry = getCacheEntry(reactorState, keyPathOrGetter)
if (!entry) {
return false
}

const storeStates = entry.get('storeStates')
if (storeStates.size === 0) {
// if there are no store states for this entry then it was never cached before
return false
}

return storeStates.every((stateId, storeId) => {
return reactorState.getIn(['storeStates', storeId]) === stateId
// if there are no store states for this entry then it was never cached before
return !storeStates.size || storeStates.some((stateId, storeId) => {
return reactorState.getIn(['storeStates', storeId]) !== stateId
})
}

/**
* Caches the value of a getter given state, getter, args, value
* Evaluates getter for given reactorState and returns CacheEntry
* @param {ReactorState} reactorState
* @param {Getter} getter
* @param {*} value
* @return {ReactorState}
* @return {CacheEntry}
*/
function cacheValue(reactorState, getter, value) {
const cacheKey = getCacheKey(getter)
const dispatchId = reactorState.get('dispatchId')
function createCacheEntry(reactorState, getter) {
// evaluate dependencies
const args = getDeps(getter).map(dep => evaluate(reactorState, dep).result)
const value = getComputeFn(getter).apply(null, args)

const storeDeps = getStoreDeps(getter)
const storeStates = toImmutable({}).withMutations(map => {
storeDeps.forEach(storeId => {
@@ -434,19 +408,11 @@ function cacheValue(reactorState, getter, value) {
})
})

return reactorState.setIn(['cache', cacheKey], Immutable.Map({
return CacheEntry({
value: value,
storeStates: storeStates,
dispatchId: dispatchId,
}))
}

/**
* Pulls out the cached value for a getter
*/
function getCachedValue(reactorState, getter) {
const key = getCacheKey(getter)
return reactorState.getIn(['cache', key, 'value'])
dispatchId: reactorState.get('dispatchId'),
})
}

/**
3 changes: 2 additions & 1 deletion src/reactor/records.js
@@ -1,4 +1,5 @@
import { Map, Set, Record } from 'immutable'
import { DefaultCache } from './cache'

export const PROD_OPTIONS = Map({
// logs information for each dispatch
@@ -38,7 +39,7 @@ export const ReactorState = Record({
dispatchId: 0,
state: Map(),
stores: Map(),
cache: Map(),
cache: DefaultCache(),
// maintains a mapping of storeId => state id (monotonically increasing integer whenever store state changes)
storeStates: Map(),
dirtyStores: Set(),