If you create, build and run the sample template you end up with an iPhone application that spins a multicolored square. Let's dive in and see if I can make sense of this.

The critical file is EAGLView.m, so let's take a look at how it's defined.
/*
This class wraps the CAEAGLLayer from CoreAnimation into a convenient UIView subclass.
The view content is basically an EAGL surface you render your OpenGL scene into.
Note that setting the view non-opaque will only work if the EAGL surface has an alpha channel.
*/
The comment refers to CAEAGLLayer, which the iPhone docs describe as the canvas on which OpenGL ES drawing occurs. The docs also say that this layer is displayed by CoreAnimation--my guess, then, is that CoreAnimation is responsible for doing at least some of the graphics "heavy lifting" for OpenGL ES. I could be wrong, but I'm going with that until something proves otherwise. Does it matter? Probably, if I'm interested in performance issues, but at this stage I'm just trying to wrap my mind around what is going on.
EAGLView.h says that this is a plain UIView. No protocols or delegates; nice and simple. All of the instance variables are private. The first new things we see are the two dimensions:
/* The pixel dimensions of the backbuffer */
GLint backingWidth;
GLint backingHeight;
Ok, the first question is, what is a "backbuffer"? I have a little basic animation-programming background, so I'm willing to guess that the backbuffer is the offscreen buffer we draw to while the actual screen displays a completed rendering. We do this so the viewer never sees a frame while it is still being drawn. So we "blit" the backbuffer to the main view in order to animate a frame.
The actual dimensions are declared as GLint. Obviously, these are integers, but why declare them as GLint? According to the spec, the GL datatypes are not C datatypes, and a GLint is a 32-bit integer. Does it matter from the perspective of our work? Probably not. It's sufficient to know that we use the GL-specific datatypes rather than the iPhone or C datatypes. So we've just defined variables to hold the height and width of the backbuffer. What if we don't want to do animation? We can probably get rid of the backbuffer if all we want to do is create a static screen using OpenGL ES.
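To make the backbuffer idea concrete, here is a minimal sketch in plain C with no OpenGL involved. Everything in it (the DoubleBuffer struct, the db_ functions, the tiny 4x2 size) is made up for illustration. Note that it presents a frame by swapping which buffer counts as the front one rather than by copying pixels; that is one common variant of the technique, and I don't know yet which one the iPhone actually uses.

```c
#include <assert.h>
#include <string.h>

#define WIDTH  4
#define HEIGHT 2

/* Two pixel buffers: one is "on screen", the other is drawn into. */
typedef struct {
    unsigned char pixels[2][WIDTH * HEIGHT];
    int front;   /* index of the buffer currently displayed */
} DoubleBuffer;

/* Start with both buffers cleared and buffer 0 on screen. */
static void db_init(DoubleBuffer *db) {
    assert(db != NULL);
    memset(db->pixels, 0, sizeof db->pixels);
    db->front = 0;
}

/* The buffer we can scribble on without the viewer seeing it. */
static unsigned char *db_back(DoubleBuffer *db) {
    return db->pixels[1 - db->front];
}

/* The buffer the viewer sees. */
static const unsigned char *db_front(const DoubleBuffer *db) {
    return db->pixels[db->front];
}

/* Draw one frame: fill the whole back buffer with a single color value. */
static void db_draw_frame(DoubleBuffer *db, unsigned char color) {
    memset(db_back(db), color, WIDTH * HEIGHT);
}

/* Present the finished frame by swapping the roles of the two buffers. */
static void db_present(DoubleBuffer *db) {
    db->front = 1 - db->front;
}
```

Calling db_draw_frame changes nothing the viewer can see; only db_present makes the new frame visible, which is the whole point of drawing offscreen.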
The next property is defined as:
EAGLContext *context;
The "context" is the environment to which we draw, and everything we draw uses a context. The iPhone docs say we have to bind/attach a framebuffer before we can actually use the context. So how is a backbuffer different from a framebuffer? A buffer is just a glob of memory. The backbuffer is basically a framebuffer that will be displayed--think of it as the next framebuffer. The spec says specifically that a framebuffer is related to the graphics hardware. Do I have to worry about this? Probably--but not from the point of view of learning. I'll probably have to figure out whether/how the iPhone graphics chips deal with framebuffers if I want optimal performance, but for right now, that's just a point of interest.
The next bit:
/* OpenGL names for the renderbuffer and framebuffers used to render to this view */
GLuint viewRenderbuffer, viewFramebuffer;
uses the term "renderbuffer." A "renderbuffer" is just the buffer that is being drawn--it is a 2D pixel image. Why is this important? A quick browse of OpenGL on the web and even the EAGLView.m file shows that OpenGL uses matrices (arrays) to describe an object's vertices (points). So a lot of OpenGL ES is based on vector graphics, not solid images. The solid images come from rendering (drawing) a vector image and then "filling it in". A vector image is based on lines and points. But the spinning square is made up of colored pixels. It seems obvious we need to somehow convert the vector representation to a bitmap one. My guess is that the renderbuffer is the result of that transformation.
[Image: a vector graphics cube]
[Image: a bitmap cube]
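Here is a rough sketch of that vector-to-bitmap step in plain C. It is only my illustration of the idea, not how OpenGL ES does it internally: the Square struct, the 8x8 grid and rasterize_square are all invented names, and the shape is limited to an unrotated square to keep the "is this pixel inside?" test trivial.

```c
#include <assert.h>

#define GRID_W 8
#define GRID_H 8

/* A square described the "vector" way: just two opposite corners,
   in coordinates running from -1.0 to 1.0 across the grid. */
typedef struct { float min_x, min_y, max_x, max_y; } Square;

/* Turn the square into pixels: visit every pixel, work out the coordinate
   of its center, and set it to 1 if that center lies inside the square.
   Returns the number of pixels that were filled. */
static int rasterize_square(const Square *sq,
                            unsigned char grid[GRID_H][GRID_W]) {
    int filled = 0;
    assert(sq != 0);
    for (int row = 0; row < GRID_H; row++) {
        for (int col = 0; col < GRID_W; col++) {
            float x = -1.0f + (col + 0.5f) * (2.0f / GRID_W);
            float y = -1.0f + (row + 0.5f) * (2.0f / GRID_H);
            int inside = x >= sq->min_x && x <= sq->max_x &&
                         y >= sq->min_y && y <= sq->max_y;
            grid[row][col] = (unsigned char)inside;
            filled += inside;
        }
    }
    return filled;
}
```

The input is four numbers; the output is a grid of pixels. That grid is the kind of thing I imagine a renderbuffer holding.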
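Now that we have a "gut feel" for these buffers, the next thing to note is that they are declared as GLuints--integers again. My first thought was that these must be pointers, but they don't use the C * pointer syntax, and the comment calls them OpenGL "names". As far as I can tell, a name is just an integer handle: OpenGL owns the actual buffer and we only hold a number that identifies it. I guess we'll see how these variables are used.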
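One way to picture an integer "name" is as a handle into a table that the library keeps to itself. The sketch below is a toy model in plain C, not OpenGL's implementation: BufferRecord, gen_buffer, set_buffer_size, buffer_width and delete_buffer are hypothetical names I made up. My understanding is that the real template obtains its names from OpenGL ES calls such as glGenFramebuffersOES, but treat that as an assumption until we read EAGLView.m.

```c
#include <assert.h>

#define MAX_BUFFERS 8

/* Library-side record; callers never see this struct, only its name. */
typedef struct { int in_use; int width, height; } BufferRecord;

/* Slot 0 is never handed out, so the name 0 can mean "no buffer". */
static BufferRecord table[MAX_BUFFERS + 1];

/* Hand out an unused name, or 0 if the table is full. */
static unsigned int gen_buffer(void) {
    for (unsigned int name = 1; name <= MAX_BUFFERS; name++) {
        if (!table[name].in_use) {
            table[name].in_use = 1;
            table[name].width = 0;
            table[name].height = 0;
            return name;
        }
    }
    return 0;
}

/* Set the size of the buffer that a name refers to. */
static void set_buffer_size(unsigned int name, int width, int height) {
    assert(name >= 1 && name <= MAX_BUFFERS && table[name].in_use);
    table[name].width = width;
    table[name].height = height;
}

/* Look up the width of the buffer that a name refers to. */
static int buffer_width(unsigned int name) {
    assert(name >= 1 && name <= MAX_BUFFERS && table[name].in_use);
    return table[name].width;
}

/* Release a name so it can be handed out again. */
static void delete_buffer(unsigned int name) {
    if (name >= 1 && name <= MAX_BUFFERS) {
        table[name].in_use = 0;
    }
}
```

The caller never touches the memory behind the buffer; it just passes the number back to the library each time. That would explain why a plain unsigned integer is enough and no pointer syntax is needed.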
/* OpenGL name for the depth buffer that is attached to viewFramebuffer, if it exists (0 if it does not exist) */
GLuint depthRenderbuffer;
My first guess was that this describes the bit-depth of the renderbuffer, but that turns out to be a different kind of depth. In OpenGL a depth buffer stores, for each pixel, how far the nearest drawn surface is from the viewer, so that nearer objects can hide farther ones. The comment says it is attached to viewFramebuffer, which fits: as I understand it, a framebuffer is a container that can have a color renderbuffer and, optionally, a depth renderbuffer attached to it. Like the other two variables, depthRenderbuffer holds an OpenGL name, and 0 means no depth buffer exists. A flat spinning square probably doesn't need one, which would explain the "if it exists" in the comment.
Finally we have two Cocoa variables:
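In OpenGL terminology a depth buffer usually means a per-pixel record of how far away the nearest drawn surface is, rather than a color bit-depth. Here is a minimal sketch of that idea in plain C. The Surface struct and the surface_ functions are invented for illustration; the only things borrowed from OpenGL are the conventions that depth runs from 0.0 (near) to 1.0 (far) and that a new pixel wins only if it is strictly nearer.

```c
#include <assert.h>

#define W 4
#define H 4

typedef struct {
    unsigned char color[H][W];  /* what the viewer would see */
    float depth[H][W];          /* distance of the nearest thing drawn so far */
} Surface;

/* Reset every pixel to a background color and the farthest possible depth. */
static void surface_clear(Surface *s, unsigned char background) {
    assert(s != 0);
    for (int y = 0; y < H; y++) {
        for (int x = 0; x < W; x++) {
            s->color[y][x] = background;
            s->depth[y][x] = 1.0f;
        }
    }
}

/* Draw one pixel only if it is nearer than what is already there.
   Returns 1 if the pixel was written, 0 if it was hidden. */
static int surface_plot(Surface *s, int x, int y, float z,
                        unsigned char color) {
    assert(x >= 0 && x < W && y >= 0 && y < H);
    if (z >= s->depth[y][x]) {
        return 0;               /* something nearer is already drawn here */
    }
    s->depth[y][x] = z;
    s->color[y][x] = color;
    return 1;
}
```

With this in place the order of drawing stops mattering: a far object drawn after a near one simply fails the test and leaves the pixel alone.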
NSTimer *animationTimer;
NSTimeInterval animationInterval;
Nothing fancy here, a timer and a timer interval that I bet will be used to drive the animation by triggering how fast (frequently) the frames flip.
We then have the property and method declarations:
@property NSTimeInterval animationInterval;
- (void)startAnimation;
- (void)stopAnimation;
- (void)drawView;
Nice understandable names.
That's it for part one. Let me know what you think, or if I'm totally off track; I'm sure I've made some incorrect assumptions along the way, but as I said this is my attempt at learning. You learn by making mistakes.

