| author | Brian S. O'Neill <bronee@gmail.com> | 2008-07-06 19:52:52 +0000 | 
|---|---|---|
| committer | Brian S. O'Neill <bronee@gmail.com> | 2008-07-06 19:52:52 +0000 | 
| commit | f1a2e3e954f2c54db94206a842cc4e87dbaa5c50 (patch) | |
| tree | eb45b72842779d4ee52708bb030b25b12f4ea859 /src/site/fml | |
| parent | 244222bfad9ef7fa85143d24ecdd7e2fda461a79 (diff) | |
Link to Wiki FAQ.
Diffstat (limited to 'src/site/fml')
| -rw-r--r-- | src/site/fml/technical-faq.fml | 394 | 
1 files changed, 0 insertions, 394 deletions
| diff --git a/src/site/fml/technical-faq.fml b/src/site/fml/technical-faq.fml deleted file mode 100644 index 4d33328..0000000 --- a/src/site/fml/technical-faq.fml +++ /dev/null @@ -1,394 +0,0 @@ -<?xml version="1.0"?>
 -<faqs id="FAQ" title="Frequently Asked Technical Questions">
 - <part id="General">
 -
 -   <faq id="process-killed">
 -     <question>What happens when a Carbonado process is killed while in the middle of a transaction?</question>
 -     <answer>
 -       <p>
 -Carbonado uses shutdown hooks to make sure that all in-progress transactions
 -are properly rolled back. If you hard-kill a process (kill -9), then the
 -shutdown hooks won't run. This can cause a problem when using BDB, and
 -db_recover must be run to prevent future data corruption. BDB-JE is not
 -affected, however, as it automatically runs recovery upon restart.
 -       </p>
 -     </answer>
 -   </faq>
 -
 -   <faq id="replicated-bootstrap">
 -     <question>How do I bootstrap a replicated repository?</question>
 -     <answer>
 -<p>
 -By running a resync operation programmatically. The ReplicatedRepository has a
 -ResyncCapability which has a single method named "resync". It accepts a
 -Storable class type, a throttle parameter, and an optional filter. Consult the
 -<a href="http://carbonado.sourceforge.net/apidocs/com/amazon/carbonado/capability/ResyncCapability.html">Javadocs</a> for more info.
 -</p>
 -<p>
 -In your application you might find it convenient to add an administrative
 -command to invoke the resync operation. This makes it easy to repair any
 -inconsistencies that might arise over time.
 -</p>
 -     </answer>
 -   </faq>
 -
 -   <faq id="deadlock">
 -     <question>I sometimes see lock timeout errors and deadlocks. What is going on?</question>
 -     <answer>
 -<p>
 -A lock timeout may be caused by an operation that for whatever reason took too
 -long, or it may also indicate a deadlock. By default, Carbonado uses a lock
 -timeout of 0.5 seconds for BDB based repositories. It can be changed by calling
 -setLockTimeout on the repository builder.
 -</p>
 -<p>
 -Deadlocks may be caused by:
 -</p>
 -<p>
 -<ol>
 -<li>Application lock acquisition order</li>
 -<li>BDB page split</li>
 -<li>Index update</li>
 -</ol>
 -</p>
 -<p>
 -In the first case, applications usually cause deadlock situations by updating a
 -record within the same transaction that previously read it. This causes the
 -read lock to be upgraded to a write lock, which is inherently deadlock
 -prone. To prevent this problem, switch the transaction to update mode. This
 -causes all acquired read locks to be upgradable, usually by acquiring a write
 -lock from the start.
 -</p>
 -<p>
 -Another cause of this deadlock is when you iterate over a cursor, updating
 -entries as you go. To resolve this, either copy the cursor entries to a list
 -first, or operate within a transaction which is in update mode.
 -</p>
 -<p>
 -The second case, BDB page split, is a problem in the regular BDB product.
 -It is not a problem with BDB-JE. When inserting records into a BDB,
 -it may need to rebalance the b-tree structure. It does this by splitting a leaf
 -node and updating the parent node. To update the parent node, a write lock must
 -be acquired but another thread might have a read lock on it while trying to
 -lock the leaf node being split.
 -</p>
 -<p>
 -There is no good solution to the BDB page split deadlock. Instead, your
 -application must be coded to catch deadlocks and retry transactions. These
 -deadlocks are more common when filling up a new BDB.
 -</p>
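 -<p>
 -The catch-and-retry advice above can be sketched as a small helper. This is
 -only an illustration: DeadlockException and DeadlockRetry are hypothetical
 -stand-ins, not Carbonado classes, and the attempt limit and pause are
 -arbitrary. In a real application you would catch whichever deadlock
 -exception your repository actually throws.
 -</p>

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for the repository's deadlock exception; not a Carbonado class.
class DeadlockException extends Exception {
    DeadlockException(String message) {
        super(message);
    }
}

class DeadlockRetry {
    // Runs the work, retrying when a deadlock is reported, up to maxAttempts attempts in total.
    static <T> T run(Callable<T> work, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                // The work is expected to enter and exit its own transaction.
                return work.call();
            } catch (DeadlockException e) {
                if (attempt >= maxAttempts) {
                    throw e; // Give up and let the caller see the deadlock.
                }
                // Brief, growing pause so that competing threads can finish first.
                Thread.sleep(10L * attempt);
            }
        }
    }
}
```

 -<p>
 -The work passed to the helper should contain the whole transaction, so that
 -each retry starts from a clean state.
 -</p>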
 -<p>
 -The third case, index updates, is caused by updating a record while another
 -thread is using the index for finding the record. Carbonado's indexing strategy
 -can be coded to defer index updates when this happens, but it currently does
 -not. In the meantime, there is no general solution.
 -</p>
 -<p>
 -Lock timeouts (or locks not granted) may be caused by:
 -</p>
 -<p>
 -<ol>
 -<li>Failing to exit all transactions</li>
 -<li>Open cursors with REPEATABLE_READ isolation</li>
 -<li>Heavy concurrency</li>
 -</ol>
 -</p>
 -<p>
 -If a transaction is left open, then any locks it acquired don't get
 -released. Over time the database lock table fills up. When using BDB, the
 -"db_stat -c" command can show information on the lock table. Running
 -"db_recover" can clear any stuck locks. To avoid this problem, always run
 -transactions within a try-finally statement and exit the transaction in the
 -finally section.
 -</p>
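 -<p>
 -The try-finally pattern described above might look like the following
 -sketch. Txn, RecordingTxn and TxnPattern are hypothetical stand-ins used to
 -keep the example self-contained; they are not the Carbonado interfaces. In
 -Carbonado the transaction would typically be obtained from the repository
 -before the try block.
 -</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in exposing only the two operations the pattern needs.
interface Txn {
    void commit();
    void exit();
}

// Records the calls it receives so that the ordering can be inspected.
class RecordingTxn implements Txn {
    final List<String> events = new ArrayList<>();

    public void commit() {
        events.add("commit");
    }

    public void exit() {
        events.add("exit");
    }
}

class TxnPattern {
    // Commits only if the work succeeds, but always exits the transaction.
    static void runInTransaction(Txn txn, Runnable work) {
        try {
            work.run();
            txn.commit();
        } finally {
            // Exiting in the finally section releases locks even when the work throws.
            txn.exit();
        }
    }
}
```

 -<p>
 -Because the exit call sits in the finally section, an exception thrown by
 -the work skips the commit but still exits the transaction.
 -</p>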
 -<p>
 -By default, BDB transactions have REPEATABLE_READ isolation level. This means
 -that all locks acquired when iterating cursors within the transaction are not
 -released until the transaction exits. This can cause the lock table to fill
 -up. To work around this, enter the transaction with an explicit isolation level
 -of READ_COMMITTED which releases read locks sooner.
 -</p>
 -<p>
 -Applications that have a high number of active threads can cause general lock
 -contention. BDB-JE uses row-level locks, and so lock timeouts caused by
 -contention are infrequent. The regular BDB product uses page-level locks, thus
 -increasing the likelihood of lock contention.
 -</p>
 -     </answer>
 -   </faq>
 -
 -   <faq id="subselect">
 -     <question>How do I perform a subselect?</question>
 -     <answer>
 -<p>
 -Carbonado query filters do not support subselects, although they can be
 -emulated. Suppose the query you wish to execute looks something like this in
 -SQL:
 -</p>
 -<p><pre>
 -select * from foo where foo.name in (select name from bar where ...)
 -</pre></p>
 -<p>
 -This can be emulated by querying bar, and for each result, fetching foo.
 -</p>
 -<p><pre>
 -// Note that the query is ordered by name.
 -Cursor<Bar> barCursor = barStorage.query(...).orderBy("name").fetch();
 -String lastNameSeen = null;
 -while (barCursor.hasNext()) {
 -    Bar bar = barCursor.next();
 -    if (lastNameSeen != null && lastNameSeen.equals(bar.getName())) {
 -        continue;
 -    }
 -    lastNameSeen = bar.getName();
 -    Foo foo = fooStorage.query("name = ?").with(lastNameSeen).tryLoadOne();
 -    if (foo != null) {
 -        // Got a result, do something with it.
 -        ...
 -    }
 -}
 -</pre></p>
 -<p>
 -For best performance, you might want to make sure that Foo has an index on its name property.
 -</p>
 -<p>
 -You may track the feature request <a href="http://sourceforge.net/tracker/index.php?func=detail&aid=1578197&group_id=171277&atid=857357">here</a>.
 -</p>
 -    </answer>
 -   </faq>
 -
 -   <faq id="table-generation">
 -     <question>Does Carbonado support generating SQL tables from Storables?</question>
 -     <answer>
 -<p>
 -No, it does not. Carbonado instead requires that your Storable definition
 -matches a table, if using the JDBC repository. When using a repository that has
 -no concept of tables, like the BDB repositories, the Storable is the canonical
 -definition. In that case, changes to the Storable effectively change the
 -"table". In addition, properties can be added and removed, and older records
 -can still be read.
 -</p>
 -<p>
 -Although it is technically feasible for Carbonado to support generating SQL
 -tables, Storable definitions are not expressive enough to cover all the
 -features that can go into a table. For example, you cannot currently define a
 -foreign key constraint in Carbonado.
 -</p>
 -     </answer>
 -   </faq>
 -
 -   <faq id="isnull">
 -     <question>How do I query for "IS NULL"?</question>
 -     <answer>
 -<p>
 -Carbonado treats nulls as ordinary values wherever possible, so nothing special
 -needs to be done. That is, just search for null like any other value. The query
 -call might look like:
 -</p>
 -<p><pre>
 -Query<MyType> query = storage.query("value = ?").with(null);
 -Cursor<MyType> cursor = query.fetch();
 -...
 -</pre></p>
 -<p>
 -When using the JDBC repository, the generated SQL will contain the "IS NULL"
 -phrase in the WHERE clause.
 -</p>
 -     </answer>
 -   </faq>
 -
 -   <faq id="sql-debugging">
 -     <question>How do I see generated SQL?</question>
 -     <answer>
 -       <p>
 -To see the SQL statements generated by the JDBC repository, you can install a
 -JDBC DataSource that logs all activity. Provided in the JDBC repository package
 -is the LoggingDataSource class, which does this. As a convenience, it can be
 -installed simply by calling setDataSourceLogging(true) on the
 -JDBCRepositoryBuilder.
 -</p>
 -<p>
 -Alternatively, you can call Query.printNative(), which by default prints the
 -native query to standard out. When using the JDBC repository, this will print
 -the SQL statement.
 -       </p>
 -     </answer>
 -   </faq>
 -
 -   <faq id="mysql-increment">
 -     <question>How do I use MySQL auto-increment columns?</question>
 -     <answer>
 -       <p>
 -Carbonado version 1.1 has thin support for MySQL. Version 1.2 (in the 1.2-dev branch)
 -supports an @Automatic annotation for supporting MySQL auto-increment columns.
 -       </p>
 -     </answer>
 -   </faq>
 -
 -   <faq id="unique">
 -     <question>Can I do the equivalent of a "unique" constraint?</question>
 -     <answer>
 -       <p>
 -The @AlternateKeys annotation is provided specifically for this purpose. Both
 -@PrimaryKey and @AlternateKeys define unique indexes. The only real difference
 -is that there can be only one primary, but many alternate keys are allowed.
 -       </p>
 -     </answer>
 -   </faq>
 -
 -   <faq id="caching">
 -     <question>What kind of caching does Carbonado provide?</question>
 -     <answer>
 -       <p>
 -Carbonado does not require repository implementations to perform any
 -caching. If you're using just the JDBC repository, there's no cache. A general
 -purpose caching repository was in development, but it was shelved because there
 -was no immediate need for it. The replicated repository, however, can be
 -considered to be a complete cache.
 -</p>
 -<p>
 -The only built-in caching is for join properties on Storable instances. It just
 -lazily sets the join result to an internal field of the Storable instance. The
 -join property value is not shared with other Storable instances.
 -        </p>
 -     </answer>
 -   </faq>
 -
 -   <faq id="join-cache">
 -     <question>How does one manually flush the Carbonado join cache?</question>
 -     <answer>
 -
 -       <p>
 -The Carbonado join cache is a lazy-read cache, local to a Storable instance. It
 -is not a global write-through cache, and so no flushing is necessary.
 -</p>
 -<p>
 -The first time a join property has been accessed, a reference is saved in the
 -master Storable instance. This optimization makes the most sense when filtering
 -based on a join property. The query loads the join property, and you'll likely
 -want it too. This avoids a double load.
 -       </p>
 -
 -     </answer>
 -   </faq>
 -
 -   <faq id="evolution">
 -     <question>How can schemas evolve?</question>
 -     <answer>
 -       <p>
 -Independent repositories like BDB support automatic schema evolution. You may
 -freely add or remove non-primary key properties and still load older
 -storables. Changes to primary key properties are not supported, since they
 -define a clustered index. Also, property data types cannot be changed, except
 -that a boxed property can be changed to a non-boxed primitive and vice versa.
 -</p>
 -<p>
 -Every storable persisted by Carbonado in BDB starts with a layout version,
 -which defines the set of properties encoded. Carbonado separately persists the
 -mapping from layout version to property set, such that when it decodes a
 -storable it knows what properties to expect.
 -</p>
 -<p>
 -When adding or removing properties, existing persisted storables are not
 -immediately modified. If you remove a property and add it back, you can recover
 -the property value still encoded in the existing storables. Property values are
 -not fully removed from an existing storable instance until it is explicitly
 -updated. At this time, the layout version used is the current one, and the
 -removed property values are lost.
 -</p>
 -<p>
 -When loading a storable which does not have a newly added property, the
 -property value is either null, 0, or false, depending on the data type. You can
 -call the isPropertyUninitialized method on the storable to determine if this
 -default property value is real or not.
 -</p>
 -<p>
 -In order to change a property type to something that cannot be automatically
 -converted, the change must be performed in phases. First, define a new
 -property, with a different name. Then load all the existing storables and
 -update them, setting the new property value. Next, remove the old property. To
 -potentially free up storage you can update all the storables again. If you wish
 -the newly added property to retain the original name, follow these steps again
 -in a similar fashion to change it.
 -</p>
 -     </answer>
 -   </faq>
 -
 -   <faq id="iterate-all">
 -     <question>How do I iterate over all storable types in a repository?</question>
 -     <answer>
 -       <p>
 -Given a repository and an appropriately set classpath, can we iterate through
 -all the various storables held in the repository without actually knowing what
 -the repository might hold in advance?
 -</p>
 -<p>
 -Repositories that implement StorableInfoCapability provide this
 -functionality. The reason it's a capability is that some repos (JDBC) don't have
 -a registry of storables. BDB-based repositories do, so they support this capability.
 -</p>
 -     </answer>
 -   </faq>
 -
 -   <faq id="index-integrity">
 -     <question>Are explicit transactions required to ensure index integrity?</question>
 -     <answer>
 -       <p>
 -The short answer is no -- index integrity is ensured automatically. More details follow:
 -</p>
 -<p>
 -When using the JDBC repository, it is up to the database vendor to ensure that
 -insert/update/delete operations include index updates within an implicit
 -auto-commit transaction. All the major database vendors do this properly
 -already, so nothing special needs to be done here.
 -</p>
 -<p>
 -When using a BDB backed repository, it is up to Carbonado to ensure implicit
 -transactions are used. Carbonado sets up BDB to be in transaction mode, and
 -there's no Carbonado level config to disable this. So you're always using BDB
 -with transactions, and that is good. When you do a lone Carbonado
 -insert/update/delete operation, it will pass null to BDB for the transaction
 -object, which implies auto-commit. BDB will automatically enter a tiny
 -transaction to protect that little change.
 -</p>
 -<p>
 -If the Storable you're updating has any indexes on it, a Carbonado trigger is
 -installed that updates the affected indexes when you do an
 -insert/update/delete. The presence of the trigger changes how the
 -auto-generated Storable behaves. The insert/update/delete operation enters a
 -transaction automatically, and it doesn't commit until all triggers have
 -run. Index updates are therefore guarded by transactions, even if you don't
 -explicitly specify one. In addition, all changes made by your own triggers are
 -always guarded by a transaction.
 -</p>
 -     </answer>
 -   </faq>
 -
 -   <faq id="delete-from-cursor">
 -     <question>How do I delete Storables returned by a Cursor without deadlocks?</question>
 -     <answer>
 -       <p>
 -The cursor iteration and delete operations must be enclosed in the same
 -transaction. Auto-commit delete while iterating over a cursor fails for some
 -databases, BDB and BDB-JE in particular. Although BDB supports a delete
 -operation on the cursor itself, the transaction requirement remains.
 -</p>
 -<p>
 -A workaround exists when using BDB-JE, which works only due to its use of
 -record-level locks. Calling Cursor.hasNext() forces the cursor to move past the
 -current record, releasing the lock on the record to be deleted. BDB native uses
 -page locks, so this trick will only work in the occasional case that the next
 -record is on another page.
 -</p>
 -<p>
 -The BDB-JE cursor implementation could be changed to automatically move to the
 -next record, but this reduces portability. Also, the cursor should not move
 -past the current record automatically if in a transaction. It would allow
 -another thread to sneak in and modify the record. An isolation level of
 -repeatable read would be required to keep the lock.
 -</p>
 -     </answer>
 -   </faq>
 -
 - </part>
 -</faqs>
 | 
