Nuggets from 1996 CGDC

Nuggets From
1996 Computer Game Developers Conference

Phil Davidson

April 16, 1996

Disclaimers:

Items not explicitly mentioned at the conference are enclosed in [square brackets]. Uncertain items are marked by [?].
The sequence of items has been changed.
The spelling of some persons' names is uncertain.
Any errors or lack of clarity should be attributed to me.
CGDC audiotapes and videotapes are available from http://www.webcom.com/knowit/cgdc/welcome.html.

Authoring Tools Hits and Misses -- seminar led by Ms. Jamie Siglar. She maintains the Usenet FAQ on multimedia authoring tools at http://www.tiac.net/users/jasiglar/MMASFAQ.HTML.
Intel processor optimization principles. Intel preceded the CGDC with its own three-day seminar on its MMX extensions.

MMX is a set of new instructions to be implemented on future Intel processors which will become common in 1997. The MMX instructions are designed for efficient handling of certain operations that are common in multimedia processing. Some of the MMX instructions can handle up to eight bytes of data at once.
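
[As a rough illustration of "eight bytes at once," the Python sketch below emulates a packed, saturating add of eight unsigned bytes, the kind of operation MMX provides (the PADDUSB instruction). The function names are invented for this sketch, and Python is used only to show the arithmetic; real code would issue the instruction from assembly.]

```python
# Emulates what one packed, saturating byte add does: eight unsigned bytes
# are combined in a single step. Plain Python, not real MMX code.

def pack8(values):
    """Pack eight byte values (0-255) into one 64-bit integer, lowest byte first."""
    assert len(values) == 8
    word = 0
    for i, v in enumerate(values):
        word |= (v & 0xFF) << (8 * i)
    return word

def unpack8(word):
    """Split a 64-bit integer back into its eight bytes, lowest byte first."""
    return [(word >> (8 * i)) & 0xFF for i in range(8)]

def packed_add_saturate(a, b):
    """Add two packed words byte by byte, clamping each sum at 255."""
    sums = [min(x + y, 255) for x, y in zip(unpack8(a), unpack8(b))]
    return pack8(sums)

# Brighten eight pixels at once; bright pixels clamp instead of wrapping around.
pixels = pack8([10, 20, 30, 40, 200, 220, 240, 255])
brighten = pack8([50] * 8)
result = unpack8(packed_add_saturate(pixels, brighten))
```

[Saturation matters for image data: without it, 220 + 50 would wrap around to a dark value instead of staying at full brightness.]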

Sources of further information.

For details about MMX, see http://www.intel.com/pc-supp/multimed/mmx/index.htm.
For Intel optimization in general, see http://www.intel.com/ial/processor/index.htm.

For intensive low-level optimization (on any Intel processor), Intel's VTune utility is a wonderful help.

It gives details about how adjacent instructions will dovetail in practice. It makes use of intimate knowledge of how the instructions delay one another.
It can perform statistical samplings on a running program, to identify the hot spots down to the level of individual instructions. This can help identify larger issues, such as limitations that come from data misalignment or from the CPU cache.

The tightest assembly-language loops are limited not by instruction execution but by the speed of transferring data into and out of the CPU.

Four bandwidth values are relevant:

The speed of ALU (arithmetic/logical unit) operations within the CPU.
The speed of data transfers between the CPU and the L1 cache (closest to the CPU).

MMX operations (eight bytes at a time) bring the CPU's data rate up to the limit of the L1 cache. Therefore they're optimal.

The speed of the L2 cache.
The speed of transfers between the CPU and main memory.

The CPU's write buffer requires eight cycles (!) to complete any write. For fast loops, this will be the limiting factor. Therefore, if a loop is limited by this write time, then consider performing further computations on the data within the loop. That is, find something useful to do during stalls.
The data cache is loaded in 32-byte chunks called lines. It was suggested that we plan the contents of 8K of the cache. Large blocks of input data can be force-loaded into the cache by reading one byte every 32 bytes. There's no advantage to force-loading data from an output-only buffer.
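
[The stride arithmetic behind force-loading can be sketched as follows. Python cannot control the CPU cache, so this shows only the access pattern; in C or assembly, each of these reads would pull one 32-byte line into the cache. The function names are invented, and the buffer is assumed to start on a line boundary.]

```python
LINE_SIZE = 32            # bytes per data-cache line, as stated in the talk
PLANNED_BYTES = 8 * 1024  # the 8K of cache whose contents are planned

def touch_offsets(buffer_len, line_size=LINE_SIZE):
    """Offsets to read so that every cache line of the buffer is touched once."""
    return range(0, buffer_len, line_size)

def preload(buffer):
    """Read one byte per line. The XOR only keeps the reads from being
    pointless; the value itself is not needed."""
    checksum = 0
    for offset in touch_offsets(len(buffer)):
        checksum ^= buffer[offset]
    return checksum

block = bytes(PLANNED_BYTES)
reads_needed = len(touch_offsets(len(block)))  # 8192 / 32 = 256 reads
preload(block)
```

[So 256 single-byte reads are enough to touch every line of an 8K input block.]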

Software Maturity: Do Game Developers Really Need It? -- lecture by Larry Constantine, famous software management consultant (of Constantine & Lockwood, Ltd., and the University of Technology, Sydney).

The Capability Maturity Model describes characteristics of different organizations in an attempt to understand their different success rates. It originated with Watts Humphrey of IBM and Carnegie Mellon University.

The five levels listed in the model relate success, consistency, and government patterns in engineering teams.

Level	Nickname	[Characteristics]	Culture & Government	Where blame is placed
1	Heroic	Extremely good or bad results but unpredictable.	Creative, heroic. Formal commitments are not taken seriously.	Individuals
2	Repeatable	Average quality of results is lower than for level 1 but more consistent, hence results can be predicted.	Committed, stable, consistent.	Process
3	Defined	Follow defined procedures. Can get sidetracked and disregard the results in favor of adherence to procedures.	Engineering: shared norms, clear expectations and responsibilities.
4	Managed	Measure results and evaluate performance	Engineers are trusted to manage the process.
5	Optimizing	The process includes improvement of the process. There is a structure for feedback. The aim is to prevent defects from occurring in the first place.	Self-managed project teams, empowered to innovate and to change the process.

It can take one to five years to evolve an organization from Level 1 to Level 5.
Benefits of pursuing higher levels (as measured in studies).

More product errors are detected.
More product errors are detected at an earlier phase of development, when they are cheaper to fix.
Delivery time is improved.
Resulting savings generally exceed cost by a factor of five.

Some problems with the advice.

The path to improvement looks more straightforward than it really is.
Improvement requires discipline and commitment, expressed partly by allocation of money.
Participants tend to give the levels too much importance in themselves.
Sometimes assessment is undertaken and completed but no changes are instituted.
The sequence of levels is improperly understood as a rigid progression.

Methodology, metrics, and models in general.

Methods and methodology.

Methodology is just a fancy synonym for methods.
Methods are neither magic nor mediocrity.
Good methods are just folklore (stories and guidelines) in book form.
Good methods are just a description of what good developers do.
Many methods are like training wheels. One uses them until one knows better. The use of training wheels is not the same as real work: the training wheels change the experience.

Metrics.

Metrics give a sense of measurement that makes sense in this context.
Metrics reveal whether actual benefits are occurring.

Models.

Models and diagrams are sometimes an easy way to inflate reports.
Models can be a simplifier to help manage complexity.
Models make it possible to employ intuition and nonverbal pattern-perception abilities.
It's easier to talk about models than to actually build computer programs.

Larry's alternative recommendations.

Good practices are techniques that are known to work (regardless of one's management theory). They include:

Commenting source code.
Code inspections. These have been measured to reveal bugs much faster, and with more accurate low levels of confidence, than testing does.
Code walkthroughs.
Project planning.
Good code architecture.

Fitting with existing corporate structures and cultures.

Here are some common cultures in development organizations. To promulgate new practices, introduce new knowledge in a way that's compatible with the existing culture. [This chart omits some examples that he mentioned.]

	Where do the practices live?	How are practices enforced?	Where is the knowledge base?	Where are the decisions made?	How can new practices be spread?
	Individuals	Self-discipline.
Informal culture		Group pressure	Folklore, stories		Promulgate new stories
Groups following defined methods.		Inspections and reviews.			Define a new method.
Military-type hierarchy	Authority structure.	Standards and audits.	Regulations.
Profession	Professional standards	Research, theory, professional licensing

"Mature" practices should fit with the people and how they already do things. They should be good fun as well as good work.
[Comment by an attendee: Maybe different parts of an organization could have different cultures of governance, depending on personalities and circumstances.]

The Quake Graphics Engine -- lecture by Michael Abrash, id Software. Quake will be the successor to id's Doom. [I expect that he'll eventually publish this material in article and book form.]

General comments.

The knowledge base required to master 3D graphics programming is much larger than with 2D. It takes about twelve months to acquire.
The Quake team built about five or six different engines before they understood what they needed to build. Had they known that at the outset, the work would have taken only a month or two.
Personal computers still are not fast enough for ideal performance. The team set 10 to 15 frames per second as the lower limit on speed. This meant limiting the richness and complexity shown on the screen.

Objectives for the Quake graphics engine.

Leapfrog the capabilities of Doom.
True, arbitrary, six-degrees-of-freedom 3D behavior. True 3D appearance for as many objects as possible. No sprites.
Highest quality. Stability of image as the point of view changes.
Correct color at every pixel.

The basic problem.

Theoretically, one should be able to sample the correct pixel color from the nearest polygon in the scene, and that would be enough.

In practice, with a simple-minded algorithm, the frame rate varies radically depending on the scene.
No present software rasterizer can provide an adequate frame rate (no worse than 10-15 frames per second). Even if one were adequate, the scene designers would increase the scene complexity and the frame rate would decline.

Two major parts to the task:

Quickly reduce the set of polygons to the relevant ones. This is becoming the real technical challenge.
Draw the right pixels from the polygons: Z-ordered, shaded, with subpixel and subtexel accuracy. Rasterization as a software issue is disappearing as hardware improves.

Part one of culling the polygons: the static world of Quake (walls, floors, ceilings). They had around 10,000 polygons lit by arbitrary fixed light sources.

They combined their entire static world into a single continuous skin.
They preprocessed their world into one big BSP tree.
They would clip away the BSP nodes that were totally outside the view pyramid.
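
[A minimal sketch of walking a BSP tree front to back while skipping subtrees that lie wholly outside the view. It is a 2D toy, not Quake's code: the Node layout is invented, polygons are just names stored in leaves, and an axis-aligned box stands in for the view pyramid.]

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # xmin, ymin, xmax, ymax

@dataclass
class Node:
    box: Box                                            # bounds of everything below this node
    plane: Optional[Tuple[float, float, float]] = None  # (nx, ny, d): nx*x + ny*y = d
    front: Optional["Node"] = None
    back: Optional["Node"] = None
    polygons: Tuple[str, ...] = ()                      # leaf contents (names only)

def overlaps(a: Box, b: Box) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def front_to_back(node, eye, view_box, out):
    """Append polygons nearest-first, skipping subtrees wholly outside view_box."""
    if node is None or not overlaps(node.box, view_box):
        return
    if node.plane is None:  # leaf
        out.extend(node.polygons)
        return
    nx, ny, d = node.plane
    eye_in_front = nx * eye[0] + ny * eye[1] >= d
    near, far = (node.front, node.back) if eye_in_front else (node.back, node.front)
    front_to_back(near, eye, view_box, out)  # the viewer's side comes first
    front_to_back(far, eye, view_box, out)

# A 10x10 world split at x = 5, with the east half split again at y = 5.
west = Node(box=(0, 0, 5, 10), polygons=("west wall",))
south_east = Node(box=(5, 0, 10, 5), polygons=("south-east wall",))
north_east = Node(box=(5, 5, 10, 10), polygons=("north-east wall",))
east = Node(box=(5, 0, 10, 10), plane=(0, 1, 5), front=north_east, back=south_east)
world = Node(box=(0, 0, 10, 10), plane=(1, 0, 5), front=east, back=west)

order = []
front_to_back(world, eye=(1, 1), view_box=(0, 0, 10, 4), out=order)
```

[Here the north-east leaf is dropped without being visited, because its bounds do not touch the view box.]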

Part two of culling the polygons: discarding polygons that are within the view pyramid but are obscured by walls within the scene.

They tried Z-buffering.
They tried edge or span sorting.
They tried using a beam tree.
Some problems arose in the case of portals (holes in a wall).
The solution was to precalculate the potentially visible set (PVS) for each level. That is, for each leaf in the BSP tree, calculate which other leaves it could potentially see.

This resulted in about 20K of data for each level.
The calculation had limited accuracy: perhaps 50% more leaves were recorded as potentially visible than were actually visible.
It was difficult to get the precalculation correct.
Having this information also speeds the drawing of moving objects.

The size of the level becomes relatively irrelevant, because unseen polygons do not matter.
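
[The run-time side of a PVS can be sketched as a table lookup. The hard part, the precalculation, is not shown; the visibility data below is made up, and storing one bitmask per leaf is an assumption of this sketch rather than a description of Quake's data format.]

```python
def build_pvs(num_leaves, sees):
    """Turn {leaf: iterable of visible leaves} into one bitmask per leaf."""
    masks = [0] * num_leaves
    for leaf in range(num_leaves):
        masks[leaf] |= 1 << leaf  # a leaf can always see itself
        for other in sees.get(leaf, ()):
            masks[leaf] |= 1 << other
    return masks

def potentially_visible(masks, leaf):
    """Leaves worth considering when the viewer stands in `leaf`."""
    return [i for i in range(len(masks)) if (masks[leaf] >> i) & 1]

# Four leaves along a winding corridor: each sees only its neighbours.
masks = build_pvs(4, {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]})
from_leaf_0 = potentially_visible(masks, 0)
from_leaf_2 = potentially_visible(masks, 2)
```

[Whatever the size of the level, the renderer looks only at the leaves listed for the viewer's current leaf, which is why unseen polygons cost nothing.]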

Avoiding overdraw. At this point, they still averaged 150% of screen pixels drawn internally (that is, 50% overdraw). Worse, depending on the location, overdraw ranged from 0% to 500%, leading to some unacceptably low frame rates.

Working from front to back, add BSP node edges to a global edge list.
The natural BSP order automatically tells which edges are foremost. The BSP node number contains this information.
Walk the scan lines across the screen, working out what to show. The result was zero overdraw.
The edge list cost 10% overhead, but helped in worst-case scenes.
Shared edges were also detected [?].
Concave polygons provided some benefits, being bigger and fewer [?].
This scheme [?] also reduces the number of polygons sent to the 3D hardware.
On what key should the edge list be sorted?

They tried sorting on 1/z, but this lost the BSP partitioning.
Eventually they sort based on the BSP order. (Lesson: BSP trees contain more implicit useful information than you might think.)
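
[The effect of zero overdraw can be sketched on a single scan line. This is a simpler stand-in for the edge list described above, not the same algorithm: spans are taken in BSP order, nearest first, and each is clipped against the pixels that nearer spans have already claimed. The tuple layout and names are invented.]

```python
def visible_spans(spans, width):
    """spans: list of (bsp_order, x_start, x_end, surface) on one scan line,
    x_end exclusive; a lower bsp_order means nearer the viewer.
    Returns (x_start, x_end, surface) pieces covering each pixel at most once."""
    free = [(0, width)]  # pixel ranges not yet claimed
    drawn = []
    for _, x0, x1, surface in sorted(spans, key=lambda s: s[0]):
        still_free = []
        for f0, f1 in free:
            lo, hi = max(x0, f0), min(x1, f1)
            if lo < hi:  # the span claims part of this free range
                drawn.append((lo, hi, surface))
                if f0 < lo:
                    still_free.append((f0, lo))
                if hi < f1:
                    still_free.append((hi, f1))
            else:
                still_free.append((f0, f1))
        free = still_free
    return sorted(drawn)

pieces = visible_spans(
    [(2, 0, 16, "far wall"), (0, 4, 10, "door"), (1, 8, 14, "pillar")],
    width=16,
)
pixels_drawn = sum(x1 - x0 for x0, x1, _ in pieces)
```

[The three overlapping spans total 28 pixels, but only 16 are drawn, one per screen pixel.]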

Rasterization.

Issues.

Gouraud shading needs triangles or it will not move correctly. [Does that mean that the colors will shift improperly as the object is moved?]
To light the details requires more polygons.
Lighting is not perspective correct.

Solution: surface caching. Precalculate texture lighting on an offscreen buffer.

The final texture-map stage therefore requires no shading.
The inner texture-draw loop requires only 7.5 cycles per pixel.
Fewer different textures are required.
The texture cache requires from 500 kilobytes to 1 megabyte. (Quake requires an 8-megabyte computer.)
Textures can be mip-mapped [?].
No rotational variance results.
The resulting lighting is perspective-correct.
Fewer polygons are required.
Any necessary postprocessing can be done on the surface cache before the texture is mapped to the object.
This strategy requires more memory.
The surface cache is too big to fit into the CPU cache.
Individual surfaces can require up to 64 kilobytes [?].
If the lights were to change, it would be slow, because the entire surface would need to be rebuilt. (Quake doesn't do dynamic lighting.) [I think he said there might be a way to fix this.]
When turning the corner into a new room, there is a slowdown as the surfaces are built. (It might be possible to display the first frame with reduced resolution.)
This technique is not a good fit for current 3D hardware, whose texture sizes are limited.
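
[A minimal sketch of the surface-caching idea: light a texture once, keep the result, and reuse it until it is evicted. The class and function names are invented, and the light map is assumed to have one sample per texel, which keeps the sketch short.]

```python
def build_surface(texture, lightmap):
    """Multiply each texel (0-255) by its light level (0-255, 255 = full light)."""
    return [[texel * light // 255 for texel, light in zip(trow, lrow)]
            for trow, lrow in zip(texture, lightmap)]

class SurfaceCache:
    """Caches pre-lit textures keyed by (surface id, mip level)."""

    def __init__(self):
        self.entries = {}
        self.builds = 0  # counts how often a surface had to be built

    def get(self, surface_id, mip, texture, lightmap):
        key = (surface_id, mip)
        if key not in self.entries:
            self.entries[key] = build_surface(texture, lightmap)
            self.builds += 1
        return self.entries[key]

cache = SurfaceCache()
texture = [[200, 100], [50, 255]]
lightmap = [[255, 128], [0, 255]]
lit_first = cache.get("wall-7", 0, texture, lightmap)   # built on first use
lit_again = cache.get("wall-7", 0, texture, lightmap)   # reused on later frames
```

[The slowdown on entering a new room corresponds to a burst of cache misses: many surfaces need build_surface at once. The inner drawing loop then copies texels from the cached surface with no lighting arithmetic.]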

Final step: draw the scene.

At this point, they have a list of 8- or 16-pixel spans and a span drawer.
Their rasterizer is 100% floating point, down to 8- or 16-bit subdivisions [?].

Performance suffers on 486 processors.
On Pentiums, FDIV instructions can overlap with other instructions.
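
[One way to read the span scheme, assuming the subdivisions are every 8 or 16 pixels: do the perspective divide only at subdivision boundaries and interpolate in a straight line between them. The sketch below compares that with a single straight line across the whole span. The function names and sample values are invented.]

```python
def exact_u(x, n, u0, z0, u1, z1):
    """Perspective-correct texture coordinate at pixel x of an n-pixel span."""
    t = x / n
    u_over_z = (1 - t) * u0 / z0 + t * u1 / z1
    one_over_z = (1 - t) / z0 + t / z1
    return u_over_z / one_over_z

def subdivided_u(n, u0, z0, u1, z1, step=8):
    """Exact values every `step` pixels; straight-line interpolation in between."""
    result = []
    for start in range(0, n, step):
        end = min(start + step, n)
        ua = exact_u(start, n, u0, z0, u1, z1)
        ub = exact_u(end, n, u0, z0, u1, z1)
        for x in range(start, end):
            result.append(ua + (ub - ua) * (x - start) / (end - start))
    return result

def max_error(values, n, u0, z0, u1, z1):
    return max(abs(v - exact_u(x, n, u0, z0, u1, z1)) for x, v in enumerate(values))

# A 32-pixel span receding from z = 1 to z = 4 across 64 texels.
span = (32, 0.0, 1.0, 64.0, 4.0)
fine = subdivided_u(*span, step=8)     # divides at each 8-pixel boundary
coarse = subdivided_u(*span, step=32)  # fully affine: divides only at the ends
error_fine = max_error(fine, *span)
error_coarse = max_error(coarse, *span)
```

[Because each divide is needed only at the next boundary, it can be started early and left to run while the current pixels are drawn, which is where the overlapping FDIV on the Pentium helps.]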

Moving entities.

Four types of representation are used.

More complicated, flexible objects with relevant details are polygon models.

General characteristics.

They range from 50 to 400 triangles.
Some have many frames. Each frame takes up to 500 bytes (not as bad as you might think: they are just vertices).

Implementation notes.

One skin per moving entity.
They couldn't be clipped.
The triangles were drawn with a separate affine rasterizer.
They are Gouraud shaded.
They will be lit dynamically.
Integer only.
Bucket-sort on 1/z batches.
The edge list is good up to 200 polygons [?].

Rectangular objects like boxes, doors, and platforms are BSP models.

These BSP trees get clipped into the world BSP tree and added to the global edge list.
The overdraw prevention is particularly useful for doors.
Within the same BSP leaf, boxy objects are sorted on 1/z. There aren't usually shared edges [?].

Sprites are used for flames. Up close, they don't look 3D.
Particles (and blood) are scaled n-by-n sprites. They have a distinctive behavior.
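
[The bucket sort on 1/z mentioned for polygon models can be sketched as follows. The bucket count, the range of 1/z, and the ordering direction are assumptions of this sketch; the talk did not give them.]

```python
def bucket_sort_by_inv_z(triangles, num_buckets=16, max_inv_z=1.0):
    """triangles: list of (name, z) with z > 0 and 1/z <= max_inv_z.
    Returns names ordered from smallest 1/z (farthest) to largest (nearest).
    Items that share a bucket keep their arrival order, so the sort is approximate."""
    buckets = [[] for _ in range(num_buckets)]
    for name, z in triangles:
        inv_z = 1.0 / z
        index = min(int(inv_z / max_inv_z * num_buckets), num_buckets - 1)
        buckets[index].append(name)
    return [name for bucket in buckets for name in bucket]

ordered = bucket_sort_by_inv_z([("near", 2.0), ("far", 20.0), ("mid", 5.0)])
```

[A bucket sort makes one pass over the triangles and never compares them with each other, which suits models of a few hundred triangles.]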

Z-buffering was performed on the three kinds of non-BSP moving entities (polygon models, sprites, and particle systems).

This prevents sorting errors.
The z-fill entails a 10% cost.
This allows postprocessing: smoke can be stamped on an image at a late stage [?].
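
[A one-scan-line sketch of how the two schemes might combine. It assumes that world spans, already free of overdraw, write their depth without testing (the z-fill), and that moving entities are then drawn with a depth test. The function names are invented.]

```python
FAR = float("inf")

def draw_world_span(color, zbuf, x0, x1, z, value):
    """World pixels never overlap, so colour and depth are written untested."""
    for x in range(x0, x1):
        color[x] = value
        zbuf[x] = z

def draw_entity_span(color, zbuf, x0, x1, z, value):
    """Entity pixels are drawn only where they are nearer than what is stored."""
    for x in range(x0, x1):
        if z < zbuf[x]:
            color[x] = value
            zbuf[x] = z

width = 8
color = ["."] * width
zbuf = [FAR] * width
draw_world_span(color, zbuf, 0, 3, 4.0, "P")    # a nearby pillar
draw_world_span(color, zbuf, 3, 8, 10.0, "W")   # a wall farther back
draw_entity_span(color, zbuf, 2, 6, 6.0, "M")   # a monster between them
draw_entity_span(color, zbuf, 4, 5, 5.0, "p")   # a particle in front of the monster
scanline = "".join(color)
```

[The monster is hidden where the pillar is nearer and visible against the wall, with no sorting of the entities themselves.]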

The precomputed PVS data is useful for large-scale culling of movable entities.

Level-of-detail (LOD) data is not enough [why is this relevant here?].

Not yet implemented: moving lights.
Conclusions.

An amazing number of techniques are available. Each technique has its strengths and weaknesses.
Try to precalculate and cache as much relevant information as possible.
First simplify the coding and make the data handling uniform (as in the z-buffering applied to the non-BSP moving entities), then optimize the code.

References.

Zen of Graphics Programming, by Michael Abrash.
___________, by Bruce Naylor (reigning expert of BSP trees).
Procedural Elements for Computer Graphics, David Rogers (McGraw-Hill).

How to Appeal to the Online Gamer -- lecture by Daniel Goldman, CEO of Total Entertainment Network (TEN) (daniel@ten.net) (http://www.ten.net/).

What is different about online gaming?

Customers interact with one another.

Players over age twenty do not like losing to young kids.
Word of mouth is important.

Revenue is based on usage or number of visits.
Game intelligence is centralized, hence not limited by the player's machine.
Persistent game worlds are possible. They can continue to change over time.
Game worlds can be linked to the real world, for example, to real Web pages.
Products retain their interest longer. Players even sometimes return after burning out.
Customers can provide content.

Do you (the game maker) want to do all the work of providing the infrastructure for online gaming?

Customer service.
CDC [customer data center?], guides.
Data security.
Physical security for the headquarters.
Networking and server code.
Work with OSPs (online service providers) and ISPs.
Billing and tracking.
Create the place and the service. (This is more important than the game [he says].)

Areas in which the game service can contribute to the game experience.

Player rankings (but not including beginners).
Tournaments, viewers, and prizes.
Persistent messages. Game history.
Guides, including sysops, hints -- even volunteer guides.
Player matching.
Player handicapping to permit more possible player matches.
Spectator mode (for example, for beginners or during tournaments). (However, cheating becomes possible, as when a spectator reveals a player's poker hand.)
Design of a player's personal identity [that is, avatar].

Miscellaneous points, questions, and answers.

Game worlds are building incrementally toward soap operas.
Broadband connectivity is inevitable, but the rate of adoption is uncertain.
Voice communication is extremely important. TEN's API will include voice support.

It's important to provide a way for a player to disguise his or her voice.

Exclusive relationships between game makers and online services are important.

There are more reasons for partners to contribute and concentrate on the success of the game.
Brand equity is better protected.
When partnering, consider the prospective partner's long-term goals.

Eventually the market will concentrate to about three national online service providers.
What about the nongamer market?

There is a big market: cribbage, chess, "You Don't Know Jack," SimCity, etc.
Big hits are harder to win.

How can low latency be accomplished?

Use only selected ISPs and obtain priority handling for your transmissions.
Offer to reconnect a game player through another service provider.
TEN uses the Concentric network (a national ISP), which uses AT&T's frame cloud.

Regular game tournament nights attract spectators (who would ordinarily be watching TV).
How many people can play a structured game at once?

The limit is the bandwidth to the clients [the players].
If more than about 150 people play, then they tend to divide into groups. 150 is about the most that one player can keep track of.

During 1996 TEN will provide 21 games online, of which two have a persistent environment. They are seeking more game contracts.
What about free trial offers to attract new subscribers? TEN believes in this. (It also induces prospective users to judge the quality of the product.)
Retail products: A game need not achieve success as a retail product before it is released online. If it's first popular online, the "buzz" of public discussion will benefit its retail release.
During long play sessions, allow a natural point in the game at which to sign off.

Design Issues for Online Virtual Communities and Playgrounds -- lecture by Ben Calica, who "is responsible for Apple's Game Technology strategy."

Goals.

Excuses for players to spend time online, partying together.
Perceived player benefits to offset the shock of the first monthly bill.

What people enjoy.

Making friends
Chat (a compelling and addictive activity).

Chat bullies can scare people away.

Hope of online dating.
Revealing themselves to other players: often they reveal intimate truths or concoct a great lie.

Self-written biographical summaries are very cool. Make it an automatic part of the sign-in process.
Online romantic correspondence resembles romantic correspondences of pre-electronic eras.
New users disillusioned by a lie [someone else's lie or their own?] tend never to return.

Incentives for players to keep coming back (especially after they receive their first monthly bill).

Regulating offensive conversation.

On the ImagiNation Network, when a comment arrives from another player, there are buttons to reply, to mute, or to complain.

"Mute" informs the other player that someone has muted him/her. You will not be able to hear from him/her again [at least during this game, presumably].
"Complain" immediately reports the remark to a supervisor. The supervisor can view the recent conversations and can expel the offending player.

The truth of someone's self-revelation can be validated by the voluntary exchange of a real phone number.
Vigilantism.

On AOL's Neverwinter Nights, some bullies would claim unused rooms and prey on newcomers. Other experienced players banded together to teach and to protect newcomers, advertising their services.

Virtual rewards.

Possible specific rewards. (The goal is for monthly bills to betoken progress toward a goal, not just a financial burden.)

Pseudo-money, such as Worlds Away's tokens. (After friendship, this is the most effective incentive.)
Access to forbidden areas. (Making them forbidden automatically increases their appeal.)
Cool objects, provided that their number is limited.
Space allocation, provided that space is limited and that it must be earned.
Building materials for one's personal part of the game world.
New abilities (flying versus walking).
Greater abilities to refine one's online appearance.

Ben likes appearances that are built by assembling a variety of pieces, like Mr. Potato Head.

Gain the ability to design and to build new parts of the online world.

Principles to govern rewards.

Abide by your own rules. (In Habitat, one player killed the monster, but unexpectedly acquired the monster's gun. Game management attempted to revoke the gun but players protested. Eventually they "bought" the gun in free trade for benefits useful in the game.)
Don't give out all your incentives too early.

Multiplayer dramas.

One approach is vactors: real actors hired to play the major roles.
Let the users be the actors for each other.
Hero-based stories don't work.

Most players want to play a major character who doesn't get stuck off-stage.
Most people are poor actors.

Allow players to play bad-guy parts, too.
See Kilobyte, by Piers Anthony.

Ways to make money.

Charge hourly for connect time (but give a reward for time spent online).
Charge for the software.
Sell advertising space.

Billboards and product placement in the game world.
Display ads while software is being downloaded.

=== End of Nuggets From 1996 CGDC ===

This page is http://www.PhilDavidson.com/tech/fora/gamsig01.htm .

Phil Davidson / Phil@PhilDavidson.com / Last modified 30 September 1999