IMVU would not be possible without the work of the thousands of developers that created these programs:
PHP Libraries:
- ADODB
- b2evolution
- Coppermine
- feed2js
- FreeTag
- Incutio XML-RPC
- jrcache
- JSON-PHP
- Magpie
- osCommerce
- phpBB
- SimpleTest
- Sarissa
- Audiere
- Boost
- Cal3D (our changes)
- CFL
- NSIS
- Pixomatic
- Python
- pywin32
- SCons
- wxPython
- Apache
- BuildBot
- eAccelerator
- Linux (Debian)
- memcached
- MySQL
- Nagios
- Perl
- perlbal
- PHP
- Roundup
- rrd
- Solr
- Subversion
Last time, we talked about including contextual information to help us actually fix crashes that happen in the field. Minidumps are a great way to easily save a snapshot of the most important parts of a running (or crashed) process, but it's often useful to understand the low-level mechanics of a C++ call stack (on x86). Given some basic principles about function calls, we will derive the implementation of code to walk a call stack.
C++ function call stack entries are stored on the x86 stack, which
grows downward in memory. That is, pushing on the stack subtracts
from the stack pointer. The ESP register points to the
most-recently-written item on the stack; thus, push eax
is equivalent to:
sub esp, 4
mov [esp], eax
Let's say we're calling a function:
int __stdcall foo(int x, int y)
The __stdcall
calling convention pushes arguments onto the stack from right to left
and returns the result in the EAX register, so calling
foo(1, 2) generates this code:
push 2
push 1
call foo
; result in eax
If you aren't familiar with assembly, I know this is a lot to absorb,
but bear with me; we're almost there. We haven't seen the
call instruction before. It pushes the return address (the value of the EIP
register, which at that point holds the address of the instruction after the call) onto
the stack and then jumps to the target function.
If we didn't store the instruction pointer, the called function would
not know where to return when it was done.
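To make the push and call mechanics concrete, here is a minimal Python sketch of a toy machine. It is only a simulation: the Cpu class, the starting ESP value, and the code addresses are all made up for illustration.

```python
class Cpu:
    """Toy x86 model with just enough state to show push and call."""

    def __init__(self, esp=0x0012FF80):
        self.esp = esp      # stack pointer (hypothetical starting value)
        self.eip = None     # instruction pointer
        self.memory = {}    # address -> 32-bit value

    def push(self, value):
        self.esp -= 4                   # sub esp, 4
        self.memory[self.esp] = value   # mov [esp], value

    def call(self, target, return_address):
        # call pushes the address of the instruction after the call,
        # then jumps to the target function.
        self.push(return_address)
        self.eip = target


def call_foo():
    """Simulate foo(1, 2) under __stdcall: arguments go right to left."""
    cpu = Cpu()
    cpu.push(2)   # push 2 (y)
    cpu.push(1)   # push 1 (x)
    cpu.call(target=0x00401080, return_address=0x0040100A)
    return cpu
```

After call_foo(), the return address sits at [esp], x at [esp+4], and y at [esp+8].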
The final piece of information we need to construct a C++ call stack is that functions live in memory, functions have names, and thus sections of memory have names. If we can get access to a mapping of memory addresses to function names (say, with the /MAP linker option), and we can read instruction pointers up the call stack, we can generate a symbolic stack trace.
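As a rough illustration of the address-to-name mapping, here is a small Python sketch. The symbol table is made up (hypothetical addresses and names); a real one would be parsed from the linker's MAP file, which this sketch does not attempt.

```python
import bisect

# Hypothetical symbol table: (start address, function name), sorted by address.
SYMBOLS = [
    (0x00401000, "main"),
    (0x00401080, "foo"),
    (0x00401100, "bar"),
]


def symbolize(address, symbols=SYMBOLS):
    """Return the function whose start address is the closest one at or below address."""
    starts = [start for start, _ in symbols]
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        return "<unknown>"
    return symbols[index][1]


def symbolize_stack(addresses, symbols=SYMBOLS):
    """Turn a list of return addresses into a list of function names."""
    return [symbolize(address, symbols) for address in addresses]
```

Note that this sketch knows nothing about function sizes, so any address past the start of the last function is attributed to that function.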
How do we read the instruction pointers up the call stack? Unfortunately, just knowing the return address from the current function is not enough. How do you know the location of the caller's caller? Without extra information, you don't. Fortunately, most functions have that information in the form of a function prologue:
push ebp
mov ebp, esp
and epilogue:
mov esp, ebp
pop ebp
These bits of code appear at the beginning and end of most functions (an optimizing compiler may omit them when frame-pointer omission is enabled), allowing you
to use the EBP register as the "current stack frame".
Function arguments are always accessed at positive offsets from EBP,
and locals at negative offsets:
; int foo(int x, int y)
; ...
[EBP+12] = y argument
[EBP+8] = x argument
[EBP+4] = return address (set by call instruction)
[EBP] = previous stack frame
[EBP-4] = local variable 1
[EBP-8] = local variable 2
; ...
Look! For any stack frame EBP, the caller's address is
at [EBP+4] and the previous stack frame is at [EBP].
By dereferencing EBP, we can walk
the call stack, all the way to the top!
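#include <windows.h>  // IsBadReadPtr
#include <vector>

// Layout of an x86 stack frame as seen from EBP: the caller's saved EBP
// comes first, followed by the return address pushed by call.
struct stack_frame {
    stack_frame* previous;
    unsigned long return_address;
};

std::vector<unsigned long> get_call_stack() {
    std::vector<unsigned long> call_stack;
    stack_frame* current_frame;
    // MSVC inline assembly (32-bit x86 only): read the current frame pointer.
    __asm mov current_frame, ebp
    // Follow the chain of saved frame pointers until one is unreadable.
    while (!IsBadReadPtr(current_frame, sizeof(stack_frame))) {
        call_stack.push_back(current_frame->return_address);
        current_frame = current_frame->previous;
    }
    return call_stack;
}
// Convert the array of addresses to names with the aforementioned MAP file.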
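If you want to experiment with the idea without an x86 debugger handy, the same walk can be simulated in Python. This is only a sketch: memory is a plain dictionary, every address in it is made up, and the "is this readable?" test stands in for IsBadReadPtr.

```python
# Simulated 32-bit memory: address -> 4-byte value. All addresses are made up.
MEMORY = {
    # Frame of the innermost function (EBP = 0x0012FF00).
    0x0012FF00: 0x0012FF20,  # [EBP]   = previous stack frame
    0x0012FF04: 0x00401085,  # [EBP+4] = return address into the caller
    # Frame of its caller (EBP = 0x0012FF20).
    0x0012FF20: 0x0012FF40,
    0x0012FF24: 0x00401010,
    # Outermost frame (EBP = 0x0012FF40); a saved EBP of 0 ends the chain.
    0x0012FF40: 0x00000000,
    0x0012FF44: 0x7C816FD7,
}


def walk_stack(ebp, memory=MEMORY):
    """Follow the chain of saved frame pointers, collecting return addresses."""
    return_addresses = []
    # Stop as soon as the frame is not readable.
    while ebp in memory and (ebp + 4) in memory:
        return_addresses.append(memory[ebp + 4])
        ebp = memory[ebp]  # [EBP] holds the previous stack frame
    return return_addresses
```

Calling walk_stack(0x0012FF00) visits the three frames from innermost to outermost and stops when the saved frame pointer is no longer a readable address.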
Yay, now we know how to grab a stack trace from any location in the code. This implementation is not robust, but the concepts are correct: functions have names, functions live in memory, and we can determine which memory addresses are on the call stack. Now that you know how to manually grab a call stack, let Microsoft do the heavy lifting with the StackWalk64 function.
Next time, we'll talk about setting up your very own Microsoft Symbol Server so you can grab accurate function names from every version of your software.
- I love fighting anime and have seen every episode of Dragon Ball Z. I have seen all of One Piece (personal all-time favorite), Bleach, and Naruto, and I stay current.
- I taught myself to use chopsticks both left- and right-handed.
- I own a Zojirushi rice cooker and eat a plain bowl of rice daily.
- I own a Zojirushi water boiler and use it frequently for oatmeal.
- Shin Ramyun are my favorite instant noodles and I keep my kitchen stocked.
- I spent three months in New York with my Fujian girlfriend (now ex) and her three Asian roommates.
- I teach myself Mandarin (slowly) and have a subscription to ChinesePod. I've memorized Liang Shan Bo Yu Zhu Li Ye and have sung it before during karaoke with my ex-girlfriend and a dozen or so of her Asian friends.
- I have been a paying member of crunchyroll for over two years.
- I have been drinking bubble tea almost daily for the past 3 years. I get most of it from Loving Hut, which is on the same block as my office in downtown Palo Alto, CA. I learned how to make it at home as well.
- I do Pincha Mayurasana daily, and can hold it for either two minutes or 10 pushups worth. Lesser known is that my inspiration to do so came from the character Zoro in One Piece.
^_^;

Sweatdrops on foreheads and tidings from kittens;
Tigers a-crouching while dragons lay hidden;
Red paper envelopes brimming with bling;
Asia is full of my favorite things.

Rose-colored petals and climactic doodles?
Green tea and pearl tea and bowls full of noodles;
High kicks that fly from the legs of Bruce Lee;
Asia is full of my favorite things.

Girls in school uniform, pigtails, and glasses
Saving the earth while they ditch all their classes;
Chinese and Japanese insanity;
Asia is full of my favorite things!

When the dubs bite,
When the plot stinks,
When I'm feeling had,
I simply remember that Asia is king
And then I don't feel so bad!
There's just one problem: our program is going to reuse threads (imvu.task maintains a threadpool), but it has no upper limit on the number of connections it will open at the same time. Let's address that.
Instead of spinning off a new task for every web request, what we'll do is maintain a set of worker tasks to process incoming requests.
That looks like this:
from imvu.task import task
from imvu.task import Future, Queue
from imvu.task import Start, Return

class Pipeliner(object):
    def __init__(self, jobCount=2):
        self.__queue = Queue()
        self.__workers = None
        self.__jobCount = jobCount

    @task
    def start(self):
        assert self.__workers is None
        self.__workers = [(yield Start(self.__work())) for _ in range(self.__jobCount)]

    @task
    def schedule(self, workItem):
        assert self.__workers is not None
        f = Future()
        self.__queue.put((f, workItem))
        result = yield f
        yield Return(result)

    @task
    def __work(self):
        while True:
            future, workItem = yield self.__queue.get()
            try:
                result = yield workItem
                future.complete(result=result, error=None)
            except Exception, e:
                future.complete(result=None, error=e)
There are a bunch of things going on in this class. Let's look at them one by one:
First, we introduce a queue. imvu.task queues are just like a queue in any other language you may see, except that you can block and wait for something to be put into the queue. This is why Pipeliner.__work yields on self.__queue.get().
Secondly, and more importantly, we introduce the concept of a Future. A Future is an object that represents a return value that may not have been computed just yet. Futures are occasionally called promises for this reason.
This class makes use of two logical threads of execution: a worker task that continuously pulls work items off a queue, performs the work, and provides results; and a second task that schedules work items, waits for the result to materialize, and returns it.
Pipeliner.schedule is our public interface: client code will use it to schedule a bit of work, and yield on it until the work is complete. Its operation is quite simple: it creates a future, places it on a queue alongside the work item, waits for the result, and returns it.
Pipeliner.__work is a simple consumer loop: wait until a work item has been enqueued, pop it off, do the work, and then provide the result to the future. The call to future.complete() provides the result to the future, which automatically wakes up any tasks that were blocked on it. Notice that complete() takes two arguments: a result and an error. This way, exceptions raised by tasks can be propagated to the tasks that are waiting on them.
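imvu.task isn't something you can install, so here is a rough sketch of the same queue-plus-futures pattern using Python 3's standard asyncio module instead. This is a stand-in for illustration, not how imvu.task is implemented; the fetch coroutine and the names job_count, stop, and main are assumptions made up for this example.

```python
import asyncio


class Pipeliner:
    """Run at most job_count work items at once (asyncio stand-in for imvu.task)."""

    def __init__(self, job_count=2):
        self._queue = asyncio.Queue()
        self._workers = None
        self._job_count = job_count

    def start(self):
        # Must be called from inside a running event loop.
        assert self._workers is None
        self._workers = [asyncio.create_task(self._work())
                         for _ in range(self._job_count)]

    async def schedule(self, work_item):
        assert self._workers is not None
        future = asyncio.get_running_loop().create_future()
        await self._queue.put((future, work_item))
        return await future  # re-raises if the worker stored an exception

    async def _work(self):
        while True:
            future, work_item = await self._queue.get()
            try:
                future.set_result(await work_item)
            except Exception as error:
                future.set_exception(error)

    def stop(self):
        for worker in self._workers:
            worker.cancel()


async def fetch(name, active, peak):
    # Hypothetical stand-in for a web request; tracks concurrent executions.
    active.append(name)
    peak.append(len(active))
    await asyncio.sleep(0.01)
    active.remove(name)
    return name.upper()


async def main():
    pipeliner = Pipeliner(job_count=2)
    pipeliner.start()
    active, peak = [], []
    results = await asyncio.gather(
        *(pipeliner.schedule(fetch(name, active, peak)) for name in "abcde"))
    pipeliner.stop()
    return results, max(peak)
```

Running asyncio.run(main()) schedules five work items, but because only two workers pull from the queue, no more than two of them are ever in flight at once.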
Using this in our URL getter application is simplicity itself. Create it, start it (to get its workers fired up), and then use it!
pipeliner = Pipeliner(jobCount=2)
taskScheduler._call(pipeliner.start())

@task
def doLotsOfStuff(url):
    all_urls.append(url)
    content = yield pipeliner.schedule(getUrl(url))
    file(filenameForUrl(url), 'wt').write(content)
    urls_to_get = [
        u for u in extractUrls(content)
        if stripExtension(u) in extensions_we_want
    ]
    yield RunInParallel(
        doLotsOfStuff(u)
        for u in urls_to_get
        if u not in all_urls
    )
Next time, I'll go into a bit of the mechanics behind futures, and show how to use that information to simplify our Pipeliner some.
I'm so excited that IMVU now supports OpenID that I wanted to post about it. Dusty has put up a simple forums post announcing it here.
What does this mean?
If you are an IMVU customer, you have an OpenID. The format is avatars.imvu.com/YourAvatarName. You can use this to log into any site that supports OpenID, some of which are listed here: http://openiddirectory.com/. I recommend LiveJournal as a great place to start - you can log in and post a comment to someone's journal using only your IMVU OpenID.
Why is this cool?
We hope IMVU developers of all kinds will use OpenID to make their IMVU-affiliated sites work without requiring you to register a separate account, and without you having to give them your IMVU password. We hope IMVU credits resellers will use this to make purchasing credits that much more secure, reliable, and convenient.
Warning: This list includes features that have not shipped. Some things that appear may never actually ship.
Groups (shipped to VIP users)
- Change the groups page to show only the most active members (top 6) panel; tests updated
- Add pagination and sorting to the view all members page
- Change how avatar names were stored in the database to make sorting memberlists by name easier
Rooms (shipped to test users)
- Add display of total size of products in product detail page
- WIP on page for inviting more people into rooms
Outfits and Outfit Contests (shipped to test users)
- Make the "yesterday's contest" text on the sidebar be a link to that contest
- Change outfits contest header to read 'daily outfits contest - vote now!'
- Display some basic information to explain what is going on on the contest voting page
- For past contests display slot for outfits even if the outfit has been deleted to prevent standings from getting rearranged
- WIP on system for issuing prizes automatically and tracking monthly standings
- WIP on outfits contest panel for the main IMVU home page
Bugs
- Fix bug where IMVU did not work with Stardock's WindowBlinds product
- Fix bug where the first three days of the "chat on first four days" rewards were getting paid out even if the user didn't chat on day 1. This caused confusion when the user did not get the big reward on the fourth day. Now explain better to users the rules for this reward program.
- WIP on group chat bugs (specifically refactoring system so that we can bring it under test)
Misc
- WIP on internal tool to track the top terms users are searching on in the catalog
- WIP on internal tool for monitoring economy by summarizing all transactions
- Migrate crash reporting and user monitoring to new database hardware
The updated source of this document is available at:
Source: http://durrett.net/MySQL_master_master_administration.html
Overview
For a production service you pretty much need at least N+1 on any critical server. Perhaps the most critical role is your database. Here are some tips to help configure MySQL for backups and fail over. There are many approaches to this, so look around for what suits your needs.
The basic setup requires two servers that use Master-Master replication (each server considers the other its master). This means all writes to the master are relayed to the slave, so the slave will have an identical copy of the database. A master can have many slaves, but any slave can only have one master. A server can be both a master and a slave. You can use more than two servers, but I will stick with two to keep the examples simple.
Getting a Snapshot
Replication is pretty simple to set up but for some reason difficult to get right. One of the most annoying parts of setting up replication is getting a clean snapshot of the server... if you are starting with an empty database you save yourself this hassle. If not, you need an exact copy of the data. This can be done in a few ways, but pretty much every method that produces an exact copy will lock the database, which makes it a bad option in a production environment.
A pretty simple way to get a snapshot is with mysqldump. Make sure you include the "--master-data" option, as this will add important replication info to the output. The following command should get a clean snapshot of your databases:
# mysqldump --add-locks --create-options --disable-keys --extended-insert --master-data --quick --lock-tables -A > snapshot.sql
If you are lucky enough to have all of the binlogs from the time you initially set up your database you can skip the snapshot process (see "Starting Replication", below).
MySQL Configuration
I am going to use "Server A" and "Server B" for my examples – these refer to the MySQL instance, most likely running on separate pieces of hardware (if not, I am not sure why you are reading this).
Of great importance, make sure Server A and Server B have different server ids in the my.cnf file. The server id is used to identify where a query originated and prevents replication from being an infinite loop (i.e. if, via replication, Server A gets a query that originated from its server id, it discards the update rather than executing the query).
So, Server A has a my.cnf like this:
server-id = 10
and Server B has a my.cnf like this:
server-id = 20
Other settings for the my.cnf file on both servers:
# The next two lines ignore replication queries for the mysql and
# test databases... you probably want this if you want to manage permissions
# on each server separately (I will identify the importance of this in the
# fail over section, later)
replicate-ignore-db = mysql
replicate-ignore-db = test
# Make sure you have binlogs. Otherwise, when restoring, you will lose
# all data added since the backup snapshot
log-bin = /var/log/mysql/mysql-bin.log
# Normally slaves do not add the queries they receive via replication
# to their binary logs. If you want to restore from backups you will want
# these updates in the binlogs.
log-slave-updates
Make sure you restart your mysql instances after you change the my.cnf settings.
Loading the Database
For loading from a snapshot, I will assume that Server A is the source and Server B is the new server you are setting up. If you are not loading from a snapshot, jump ahead to "Starting Replication".
Take your snapshot from earlier and load it on Server B. Assuming Server B is a new install of MySQL with only default databases, this should work (assuming the root password is blank):
# mysql < snapshot.sql
Once this completes Server B should be an exact copy of Server A at the time the snapshot was taken.
Starting Replication
Next you need to start replication. Once replication is started, all queries written to a master will be sent to the slaves, ideally keeping an exact copy.
Before a slave can replicate from a master, you need to grant permissions on the master. The following would grant replication to the mysql user 'rep' from any machine (please take the appropriate password / firewall precautions before you do this). As we will be running with a master-master setup, this needs to be done on both Server A and Server B:
mysql> GRANT REPLICATION SLAVE ON *.* TO 'rep'@'%';
Next you need to tell each slave how to connect to the master. If you took a snapshot, the information you need is in the snapshot.sql file. In the first 50 or so lines you should see a statement like:
--
-- Position to start replication or point-in-time recovery from
--
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000321', MASTER_LOG_POS=8675309;
If you don't see this, you probably did not use the '--master-data' option when doing a mysqldump (bummer).
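If you would rather not eyeball the file, a few lines of Python can pull the position out for you. This is a minimal sketch that assumes the statement looks exactly like the example above (one line, log file first, then position); the function name and the 100-line cutoff are my own choices.

```python
import re

# Matches the statement that mysqldump --master-data writes near the top of the dump.
CHANGE_MASTER_RE = re.compile(
    r"CHANGE MASTER TO MASTER_LOG_FILE='([^']+)',\s*MASTER_LOG_POS=(\d+);")


def find_master_position(lines, max_lines=100):
    """Return (log_file, log_pos) from the first lines of a dump, or None."""
    for line_number, line in enumerate(lines):
        if line_number >= max_lines:
            break
        match = CHANGE_MASTER_RE.search(line)
        if match:
            return match.group(1), int(match.group(2))
    return None
```

You can pass an open file directly, for example find_master_position(open("snapshot.sql")); a result of None suggests the dump was taken without --master-data.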
If you did not use a snapshot because you have all of the binlogs from the initial setup of the other database, you should assume that MASTER_LOG_FILE is the name of the oldest binlog on your system and MASTER_LOG_POS=4.
With that information in hand, you can now tell the slave how to connect to the master. So, on Server B, issue the following command:
mysql> CHANGE MASTER TO MASTER_HOST='[hostname or IP of Server A]', MASTER_USER='rep', MASTER_PASSWORD='[hopefully you set a password]', MASTER_LOG_FILE='[MASTER_LOG_FILE]', MASTER_LOG_POS=[MASTER_LOG_POS];
So, using my example snapshot.sql from above (and assuming Server A has an IP address of 192.168.99.10) this would be:
mysql> CHANGE MASTER TO MASTER_HOST='192.168.99.10', MASTER_USER='rep', MASTER_PASSWORD='sekrit', MASTER_LOG_FILE='mysql-bin.000321', MASTER_LOG_POS=8675309;
You can double check your work like this:
mysql> SHOW SLAVE STATUS \G
Once all of this is working, you can start the replication on the slave with the following command:
mysql> START SLAVE;
After that, do another:
mysql> SHOW SLAVE STATUS \G
If all went well you should see "Slave_IO_Running: Yes", "Slave_SQL_Running: Yes" and "Seconds_Behind_Master:" should be some numeric value, ideally "0". If it did not go well, troubleshooting is beyond the scope of this document.
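If you want to script that check, here is a rough Python sketch that parses the vertical-format output and tests the three fields mentioned above. The sample text is abbreviated and hypothetical (real output has many more fields), and the function names are my own.

```python
# Abbreviated, hypothetical output of SHOW SLAVE STATUS in vertical format.
SAMPLE_OUTPUT = """\
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.99.10
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 0
"""


def parse_slave_status(text):
    """Parse the 'Name: value' lines of the vertical-format output into a dict."""
    status = {}
    for line in text.splitlines():
        name, separator, value = line.strip().partition(":")
        if separator:  # lines without a colon (the row header) are skipped
            status[name.strip()] = value.strip()
    return status


def replication_is_healthy(status, max_lag_seconds=0):
    """True if both slave threads run and the lag is a number within the limit."""
    lag = status.get("Seconds_Behind_Master", "NULL")
    return (status.get("Slave_IO_Running") == "Yes"
            and status.get("Slave_SQL_Running") == "Yes"
            and lag.isdigit()
            and int(lag) <= max_lag_seconds)
```

A lag of NULL is treated as unhealthy here, since the value is only numeric while replication is running.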
At this point Server B is a slave of Server A. Important: do not execute any queries on Server B that will result in an update. Now we want to make Server A a slave of Server B.
On Server B, execute the following:
mysql> SHOW MASTER STATUS;
You will use these values to set up replication on Server A. If you execute this more than once you may see the values are updating – this is fine so long as you are not executing update queries on Server B yet – the updating values reflect updates to Server A being replicated to Server B.
On Server A, execute the following:
mysql> CHANGE MASTER TO MASTER_HOST='[hostname or IP of Server B]', MASTER_USER='rep', MASTER_PASSWORD='[hopefully you set a password]', MASTER_LOG_FILE='[MASTER_LOG_FILE]', MASTER_LOG_POS=[MASTER_LOG_POS];
So assuming Server B has an IP address of 192.168.99.20, this might look like:
mysql> CHANGE MASTER TO MASTER_HOST='192.168.99.20', MASTER_USER='rep', MASTER_PASSWORD='sekrit', MASTER_LOG_FILE='mysql-bin.000011', MASTER_LOG_POS=94703;
Again on Server A, start the slave, confirm it is working and pat yourself on the back:
mysql> START SLAVE;
mysql> SHOW SLAVE STATUS \G
Congratulations, you now have Master/Master replication working.
Using Your Databases
So you may think, cool... I now have twice the capacity on my database. Not really. For one, all writes done to one database also have to be done to the other. You may be able to take advantage of more read capacity, but if your application requires the read capacity of both servers, you don't have N+1 redundancy – when one server dies all reads will go to the remaining server, probably overloading it. But neither of these is the really big problem...
When using replication in MySQL, writes to multiple servers are not necessarily written in the order they were intended. For example, let's say you do an insert on Server A and, at the same time, you do an insert on Server B. If the insert was done to a table that has an auto_increment primary key, it is possible that both inserts will get the same id because they were not aware of the other insert at the time they were written. You can write your application to avoid this, but if you are starting with most off-the-shelf or open source solutions, they probably are not written to deal with this situation.
Once you have duplicate keys your replication will fail because the slave will not be able to perform the insert it received from the master. At that point your replication is inconsistent... very bad. For this reason, you probably never want to be writing to both servers at the same time.
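To see why this breaks, here is a toy Python model of two masters that each hand out auto_increment ids on their own. It is a simulation only: the Server class and DuplicateKeyError are made up for illustration and ignore everything else MySQL does.

```python
class DuplicateKeyError(Exception):
    """Raised when a replicated insert collides with an existing primary key."""


class Server:
    """Toy model of one MySQL server with a single auto_increment table."""

    def __init__(self, name):
        self.name = name
        self.rows = {}      # primary key -> value
        self.next_id = 1

    def insert(self, value):
        """Local insert: assign the next id and return the event to replicate."""
        row_id = self.next_id
        self.rows[row_id] = value
        self.next_id = row_id + 1
        return (row_id, value)

    def apply_replicated(self, event):
        """Apply an insert received from the other master."""
        row_id, value = event
        if row_id in self.rows:
            raise DuplicateKeyError(
                "duplicate id %d on Server %s" % (row_id, self.name))
        self.rows[row_id] = value
        self.next_id = max(self.next_id, row_id + 1)


def simultaneous_inserts():
    """Both servers insert before either has seen the other's write."""
    a, b = Server("A"), Server("B")
    event_a = a.insert("written to A")
    event_b = b.insert("written to B")
    try:
        b.apply_replicated(event_a)
    except DuplicateKeyError as error:
        return event_a[0], event_b[0], str(error)
    return event_a[0], event_b[0], None
```

Both inserts get id 1, so Server B cannot apply the row it receives from Server A, and replication stops there.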
So instead I recommend that you have one server set up as your "primary" and one set up as your "standby". Don't assign these names to your servers because they will switch during a fail over... think of "primary" and "standby" as a pointer to the server. All reads and writes are sent to the primary server, while your standby is used for getting clean backups without blocking your application and can become the primary when the primary server fails. I like to enforce this restriction with MySQL permissions... the application only has write permission to the primary server and read (or no) permission to the standby.
Backups
Backups are simple at this point – simply use the mysqldump example from above on your standby server. You can also stop the server and copy the binary database data – you are on your own though. During these backups your standby database will not update so it will fall behind the primary database (you can see how many seconds behind it is with the "SHOW SLAVE STATUS" command). This leaves you a little more vulnerable to failure during backups – if this is too risky, you can set up a third slave that is only used for backups.
Note: when using the "--master-data" option in mysqldump, the output is the pointer to the master data for the machine performing the dump, not for that machine's master! So if your standby server is Server B and you do a mysqldump on it, to restore from that snapshot you point to Server B as the master, not Server A. This is also why it is important to have "log-slave-updates" enabled in the my.cnf file – the master data from the backup snapshot needs to point at binlogs on Server B. If you are not logging slave updates, these binlogs will be empty because all writes on Server B are from replication.
So to recap, in order to restore this database with no data loss, you need both the snapshot and binlogs from the server you use for backups (your standby). If you just have your snapshot, you will lose all updates that occurred between the snapshot and the failure. When I update this document I will add some tricks to get around this.
Fail over
Fail over is relatively simple – make your application stop writing to the primary server and start writing to the standby server (and now think of "standby" as "primary" and "primary" as "standby"). In practice there are a few precautions you may want to take:
- If the primary database is accessible, disable write access to it so that you avoid the duplicate key issues. In many failure cases you will have to skip this step.
- Before starting to write on the standby, make sure that it is '0' seconds behind the master. Otherwise you risk duplicate key issues.
- If you have automated backup scripts, you may want to point them at the new standby... otherwise they will lock your new primary database when they kick in.
Recovery
Recovery works pretty much exactly like the initial setup of the servers – load the snapshot and start replication. There is one huge gotcha that I don't usually see identified... If you need to do a full restore of what was previously your primary server, you are likely to get errors in replication... this is because, when replicating from the restore, it ignores the updates it performed prior to its failure. Here is an example scenario:
- A snapshot is taken from standby Server B at 6:00 AM
- At 11:30 AM, primary Server A fails and we redirect application to Server B
- Server A is re-installed from scratch with the 6:00 AM Server B snapshot
- Server A is told to replicate from Server B starting at the log position of the snapshot
- Since all binlogs from 6:00 AM until 11:30 AM originated from Server A, it ignores these statements and actually starts replication at 11:30 AM, when Server B was the origin of the updates.
The quick fix (hack) is giving Server A a different server id until it catches up to Server B and then it can get its old server id back.
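The filtering behind this gotcha is easy to model. Below is a small Python sketch of a slave skipping binlog events that carry its own server id; the binlog contents and the temporary id 11 are made up to mirror the scenario above, and the server ids 10 and 20 are the ones from the my.cnf examples.

```python
# Hypothetical binlog on Server B (server id 20) after the 6:00 AM snapshot.
# With log-slave-updates, events replicated from Server A keep A's id (10).
BINLOG_ON_B = [
    {"time": "07:15", "server_id": 10, "query": "insert from A"},
    {"time": "10:40", "server_id": 10, "query": "update from A"},
    {"time": "11:45", "server_id": 20, "query": "insert from B"},
]


def replicate(binlog, slave_server_id):
    """Return the events a slave executes: all those not carrying its own id."""
    return [event for event in binlog
            if event["server_id"] != slave_server_id]


def restore_server_a(binlog=BINLOG_ON_B):
    """Compare restoring Server A with its old id (10) and a temporary id (11)."""
    with_old_id = replicate(binlog, slave_server_id=10)
    with_temporary_id = replicate(binlog, slave_server_id=11)
    return len(with_old_id), len(with_temporary_id)
```

With its old id, the restored Server A applies only the 11:45 event and silently loses the morning's writes; with a temporary id it replays all three.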
In place of the (Sales: X) mark on a product's info page, I put a "Developer Reports" link, that developers can use to see lots of information.
Any developer "involved" in a product (meaning, any developer in the product derivation chain) can look at a product's reports.
Developers can also see listings of all their products, income events, etc...
This should be more useful than developer emails, so I have turned the developer emails off. Marcus has suggested that perhaps I add a weekly digest email sent to developers?
Check it out, everyone, and let me know what you think.
forum post: http://www.imvu.com/catalog/modules.php?op=modload&name=phpbb2&file=viewtopic.php&p=148617#148617