Why use output buffering in PHP?

Question

I have read quite a bit of material on the Internet in which various authors suggest using output buffering. Interestingly, most of them argue for it only because it allows mixing response headers with actual content.

I think that Web applications beyond a certain size or complexity should not mix generating headers and content, and that a developer should rightly suspect faults in an application that attempts to send headers after the body has been generated.

This is my first argument against the ob_* output buffering API. Even for that little convenience you get - mixing headers with output - I believe it isn't worth it, unless one simply is rapidly "hacking" or "prototyping" scripts.

Also, I think most people dealing with the output buffering API do not think about the fact that even without the explicit output buffering enabled, PHP in combination with the Web server it is plugged into, still does some internal buffering anyway. It is easy to check - have a script echo some short string, then sleep for say 10 seconds, then do another echo. Go to the script URL with a Web browser and watch the blank page pause for 10 seconds, with both lines appearing thereafter. Before some say that it is a rendering artefact, not traffic, tracing the actual traffic between the client and the server shows that the server has generated the Content-Length header with an appropriate value for the entire output - suggesting that the output was not sent progressively with each echo call, but accumulated in some buffer and then sent on script termination.
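
The experiment described above can be reproduced with a script along these lines. This is only a sketch: the function name `slow_page` and the CLI guard are illustrative additions, and what the browser actually receives depends on the web server and SAPI configuration.

```php
<?php
// Hypothetical helper reproducing the experiment: echo, pause, echo again.
function slow_page(int $delaySeconds = 10): void
{
    echo "first line<br>\n";
    sleep($delaySeconds);   // deliberately no flush() between the two echoes
    echo "second line<br>\n";
}

// Run the full-length experiment only when served by a web server.
if (PHP_SAPI !== 'cli') {
    slow_page();
}
```

Watching the response in a traffic tracer while this runs shows whether the first line left the server before the pause.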

This is one of my gripes with explicit output buffering - why do we need two different output buffer implementations on top of one another? Is it perhaps because some of it is subject to conditions a PHP developer cannot control, so another means of control was added to PHP?

In any case, I, for one, am starting to think one should avoid explicit output buffering (the ob_* family of functions) and rely on the implicit one, assisting it with the good old flush function when necessary. If there were some guarantee that the Web server actually commits output to the client with each echo/print call, then explicit buffering would be useful - after all, one does not want to send the response to the client in 100-byte chunks. But the alternative with two buffers seems like a somewhat useless layer of abstraction.

With all this in mind, why use output buffering?

I'm so impressed that the first answers came in about 3 min. after the question was asked. That's some speedy reading! — troelskn, Commented Jan 27, 2010 at 15:46
@Chacha102: and @troelskn: Wow, the Internet has really destroyed your ability to read, hasn't it? It's really not that much to read. And in my opinion, a "wall of text" doesn't feature such nice things as paragraph breaks. I hate to give you two (and the upvoters) a hard time, but we should be praising people who take the time to elaborate on their questions rather than mocking them. If your attention span is that short, why respond? — eyelidlessness, Commented Jan 27, 2010 at 15:56
I thought Stack Overflow was for questions with answers, not debates...? — Martin Bean, Commented Jan 27, 2010 at 16:03
I do write perhaps a bit too much, you are right. In my defense, I would rather over-explain myself in a single question than spam the question list with a question that is too vague and needs clarification. In any case, Chacha102, I am sorry you have wasted 2 minutes of your life. Better sense of judgement next time, after all no one asked you to read my wall of text. — Armen Michaeli, Commented Jan 27, 2010 at 16:05
@eyelid, Given that I've read it twice might lend to the fact that I don't have a short attention span. I didn't mean the comment to be derogatory, but obviously you aren't able to detect sarcasm or humor on the internet. Maybe your ability to detect humor has been destroyed by the internet. — Tyler Carter, Commented Jan 27, 2010 at 16:10

David Bullock · Accepted Answer · 2013-07-05 06:09:44Z

Yes

Serious web applications need output buffering in one specific situation:

Your application wants control over what is output by some 3rd-party code, but there is no API to control what that code emits.

In that scenario, you can call ob_start() just before handing control to that code, mess around with what is written (ideally with the callback, or by examining the buffer contents if you must), and then call ob_flush().

Ultimately, PHP's ob_* functions are a mechanism for capturing what some other bit of code does into a buffer you can mess with.
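
A minimal sketch of that capture-and-modify pattern follows. The names `third_party_widget` and `capture_output` are made up for illustration, and the sketch uses ob_get_clean() to take the buffer contents rather than ob_flush().

```php
<?php
// Stand-in for third-party code that can only echo (hypothetical).
function third_party_widget(): void
{
    echo '<a href="http://example.com/">home</a>';
}

// Run any callable and return what it printed instead of sending it.
function capture_output(callable $code): string
{
    ob_start();
    try {
        $code();
        return (string) ob_get_clean();
    } catch (Throwable $e) {
        ob_end_clean();   // discard partial output and restore the buffer level
        throw $e;
    }
}

$html = capture_output('third_party_widget');
// Inspect or rewrite the captured text before emitting it.
echo str_replace('http://', 'https://', $html), "\n";
```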

If you don't need to inspect or modify what is written to the buffer, there is nothing gained by using ob_start().

Quite likely, your 'serious application' is in fact a framework of some kind.

You already have output buffering, anyway

You don't need ob_start() in order to make use of output buffering. Your web-server already does buffer your output.

Using ob_start() does not get you better output buffering - it could in fact increase your application's memory usage and latency by 'hoarding' data which the web-server would otherwise have sent to the client already.

Maybe `ob_start()` ...

... for convenience when flushing

In some cases, you may want control over when the web-server flushes its buffer, based on some criteria which your application knows best. Most of the time, you know that you just finished writing a logical 'unit' which the client can make use of, and you're telling the web-server to flush now and not wait for the output buffer to fill up. To do this, it is simply necessary to emit your output as normal, and punctuate it with flush().
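
As a rough sketch of punctuating output with flush(): the helper name `send_units` and its input are illustrative. Note that flush() only asks PHP and its backend to push data on, so a web server or proxy may still hold it back.

```php
<?php
// Emit each logical unit as normal and follow it with flush().
// $units is any iterable of ready-to-send strings (illustrative).
function send_units(iterable $units): int
{
    $sent = 0;
    foreach ($units as $unit) {
        echo $unit, "\n";
        flush();   // ask PHP to pass its system write buffers on now
        $sent++;
    }
    return $sent;
}

send_units(['<p>Section one</p>', '<p>Section two</p>']);
```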

More rarely, you will want to withhold data from the web-server until you have enough data to send. No point interrupting the client with half of the news, especially if the rest of the news will take some time to become available. A simple ob_start(), later concluded by ob_end_flush(), may indeed be the simplest appropriate thing to do.
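
The withholding case might look like the sketch below, where `send_when_complete` and the list of fragment-producing callables are hypothetical names chosen for the example.

```php
<?php
// Hold back a unit of output until every part of it is ready.
// $parts is a list of callables that each echo one fragment (illustrative).
function send_when_complete(array $parts): void
{
    ob_start();                 // start withholding
    foreach ($parts as $part) {
        $part();                // may be slow; nothing is passed on yet
    }
    ob_end_flush();             // release everything in one go
}

send_when_complete([
    function () { echo "headline\n"; },
    function () { echo "rest of the story\n"; },
]);
```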

... if you have responsibility for certain headers

If your application is taking responsibility for calculating headers which can only be determined after the full response is available, then it may be acceptable.

However, even here, if you can't do any better than deriving the header by inspecting the complete output buffer, you might as well let the web-server do it (if it will). The web-server's code is written, tested, and compiled - you are unlikely to improve on it.

For example, it would only be useful to set the Content-Length header if your application knows the length of the response body before it computes the response body.
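
One case where the length is known up front is sending an existing file. The sketch below assumes a caller-supplied path; the function name `send_file_with_length` is hypothetical, and no output buffer is involved.

```php
<?php
// Send a file whose size is known before any body bytes are produced.
// $path is a hypothetical file location supplied by the caller.
function send_file_with_length(string $path): int
{
    $length = filesize($path);
    if ($length === false) {
        throw new RuntimeException("Cannot stat $path");
    }
    if (!headers_sent()) {
        header('Content-Length: ' . $length);   // known up front, no buffering needed
    }
    readfile($path);
    return $length;
}
```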

No panacea for bad practices

You should not use ob_start() to avoid the disciplines of:

opening, using and quickly closing resources such as memory, threads and database connections
emitting headers first, and the body second
doing all the calculations and error handling you can, before beginning the response

If you do, the result is technical debt which will make you cry one day.

salvador · Accepted Answer · 2017-04-12 16:57:17Z

9

OK, here is the real reason: the output is not started until everything is done. Imagine an app which opens an SQL connection and doesn't close it before starting the output. What happens is your script gets a connection, starts outputting, waits for the client to get all it needs, then, at the end, closes the connection. Woot, a 2s connection where a 0.3s one would be enough.

Now, if you buffer, your script connects, puts everything in a buffer, disconnects automatically at the end, then starts sending your generated content to the client.

answered Jan 27, 2010 at 15:42 by Arkh; edited Apr 12, 2017 at 16:57 by salvador

Eh!? Something is seriously wrong with your page if it takes 2 seconds to just load the structure. Also having open connection to DB has almost no impact. What matters is how often you make those connections and what you do with them. -1
– Maiku Mori
Commented Jan 27, 2010 at 16:03
Again, a good reason, thanks. On the other hand, you can choose - a shorter database connection timespan or more progressive user response feedback. I would say, and this especially applies to queries with big results, that progressive user feedback wins. Unless you hit the connection limit, of course, which IS a HARD limit.
– Armen Michaeli
Commented Jan 27, 2010 at 16:15
Thanks for the downvote, but yeah, some pages can be quite heavy and take a "long" time for the browser to download (lots of data with embedded JavaScript, for example; a 2MB page is not exceptional). During that time you have at least one uselessly open database connection (if not several, plus file handles which prevent other scripts or apps from editing those files, etc.).
– Arkh
Commented Jan 27, 2010 at 16:17
Amn, sure progressive user feedback is good. But if your script is a webservice asked for a big report, you've got to do it. Just to put things into perspective. To load this page, firebug is saying my firefox is wasting 515ms in the download phase (not the request send and waiting time).
– Arkh
Commented Jan 27, 2010 at 16:19
You have a valid point, I marked your answer as useful. I would think however, that when one wants to fetch query results and free the connection fast, a better solution is to offload the results in internal PHP variables (arrays, lists, etc), which may or may not have anything to do with output. After that you close the connection and start to send the formatted data to the client, buffered implicitly. It is your argument, just a bit modified.
– Armen Michaeli
Commented Jan 27, 2010 at 16:41

Daniel Vandersluis · Accepted Answer · 2010-01-27 15:42:05Z

7

If you want to output a report to the screen but also send it through email, output buffering lets you not have to repeat the processing to output your report twice.
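
A sketch of that reuse, assuming a report renderer that prints directly: `render_report` and `emit_and_keep` are illustrative names, and the mail step is left as a comment because the recipient and subject are not part of the original.

```php
<?php
// Hypothetical report renderer that prints directly.
function render_report(array $rows): void
{
    foreach ($rows as $name => $total) {
        echo "$name: $total\n";
    }
}

// Print the output of $render and also return a copy of what was printed.
function emit_and_keep(callable $render): string
{
    ob_start();
    $render();
    $copy = (string) ob_get_contents();   // keep a copy
    ob_end_flush();                       // and still send it on
    return $copy;
}

$body = emit_and_keep(function () {
    render_report(['north' => 12, 'south' => 7]);
});
// $body can now be handed to a mailer, for example mail($to, $subject, $body).
```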

That's one good reason. Thanks. I missed it, even though I did it in one script a while ago, albeit not with emails, but by post-processing data before it was actually sent to the client, like fixing URLs in HTML text.
– Armen Michaeli
Commented Jan 27, 2010 at 16:11
It's something that we do a lot of at work because we've got a lot of report applications where the user wants the output for posterity. Saves on duplication, obviously. :)
– Daniel Vandersluis
Commented Jan 27, 2010 at 18:44

eyelidlessness · Accepted Answer · 2010-01-27 15:45:17Z

The most obvious use cases are:

An output filter (eg ob_gzhandler or any number of filters you could devise on your own); I have done this with APIs that only support output (rather than return values) where I wanted to do subsequent parsing with a library like phpQuery.
Maintenance (rather than rewriting) of code written with all the problems you discuss; this includes things like sending headers after output began (credit Don Dickinson) or suppression of certain output that has already been generated.
Staggered output (credit here to Tom and Langdon); note that your tests may have failed because it conflicts with PHP/Apache's default internal buffer, but it is possible to do, it simply requires a certain amount to be flushed before PHP will send anything—PHP will still keep the connection open though.

Don Dickinson · Accepted Answer · 2010-01-27 15:32:32Z

3

I use output buffering for one reason ... it allows me to send a "Location" header after I've begun processing the request.
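
The pattern described here could be sketched as follows. The function name `run_or_redirect` and the `/error.php` URL are placeholders, and the sketch illustrates the technique rather than recommending it over evaluating all conditions before output.

```php
<?php
// Run $work under a buffer; on failure discard its output and redirect.
// '/error.php' is a placeholder URL.
function run_or_redirect(callable $work, string $errorUrl = '/error.php'): bool
{
    ob_start();
    try {
        $work();
        ob_end_flush();          // success: release the buffered page
        return true;
    } catch (Throwable $e) {
        ob_end_clean();          // failure: the partial page is never sent
        if (!headers_sent()) {
            header('Location: ' . $errorUrl);
        }
        return false;
    }
}
```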

What would "processing the request" mean here? Sure, you can process, as long as you don't send any data before headers. I am still not convinced on a good reason to send location header after sending data. Maybe there exists a corner case?
– Armen Michaeli
Commented Jan 27, 2010 at 16:08
Processing the request might mean sending some SQL data back to the client or perhaps setting a cookie. Later in the call, an SQL call fails. You may not want the user to see the first stuff ... the cookie or the other SQL data. In that case, canceling the buffer and sending a location to a generic error page or whatever might be necessary. In theory it would be great to not send data, but you never know when an error will occur, and there may be very good reasons to NOT send the buffer to the client.
– Don Dickinson
Commented Jan 27, 2010 at 18:25
The bottom line is that you shouldn't be sending data to output while there are still evaluations to be made. All conditions should be evaluated before output is ever composed. This is the beauty of an MVC architecture (well, the VC part of it, anyway)--the architecture makes it easy to isolate logical operations and to order things in steps, of which sending to output is the last.
– Brian Warshaw
Commented Jul 23, 2012 at 18:10

Jon Benedicto · Accepted Answer · 2010-01-27 15:38:23Z

3

Output buffering is critical on IIS, which does no internal buffering of its own. With output buffering turned off, PHP scripts appear to run a lot slower than they do on Apache. Turn it on and they run many times faster.

Can you provide any proof of this beyond the fact that it "seems" slower?
– Langdon
Commented Jan 27, 2010 at 15:57
@Langdon, to be fair, perceived performance definitely is real performance, from a user's perspective.
– eyelidlessness
Commented Jan 27, 2010 at 16:00
I could, but I don't want to restart my IIS server right now :-) Try running phpBB3 in IIS with and without output buffering, and use Firebug to note the time it takes to receive the HTML.
– Jon Benedicto
Commented Jan 27, 2010 at 16:07
I have no extensive experience with IIS. What you imply is that the server sends chunks of output data with each 'echo' etc., without 'Content-Length'? In that case it is indeed better to have SOME output buffering than none at all, especially if sending short strings, I guess. However, I was arguing against the type of cases where both output buffering schemes work in parallel, which seems to be a waste.
– Armen Michaeli
Commented Jan 27, 2010 at 16:10

nalply · Accepted Answer · 2012-08-03 21:46:21Z

It's an old question, but nobody has said that an important feature of output buffering is filtering. It is possible to preprocess the buffer before sending it to the client.

This is a very powerful concept and opens many intriguing possibilities. In a project I used two filters simultaneously:

ad-hoc translation of terms (replacement of short texts)
obfuscation of HTML, CSS and Javascript (don't ask me why)

To enable output filtering call ob_start("callback") where callback is the name of the filtering function. For more details see PHP's manual for ob_start: http://php.net/manual/en/function.ob-start.php
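
A small sketch of such a filter, in the spirit of the term translation mentioned above. The callback name `translate_terms` and its term table are invented for the example.

```php
<?php
// Filter callback: receives the buffered text, returns what should be sent.
// The term table is illustrative.
function translate_terms(string $buffer): string
{
    return strtr($buffer, ['Hello' => 'Bonjour', 'Goodbye' => 'Au revoir']);
}

ob_start('translate_terms');
echo "Hello, world\n";
ob_end_flush();   // the callback runs here, then the result is passed on
```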

Don · Accepted Answer · 2010-01-27 15:44:29Z

2

I use output buffering in order to avoid generating HTML by string concatenation, when I need to know the result of a render operation to create some output before I use the rendering.

Tom · Accepted Answer · 2010-01-27 15:31:34Z

1

We used to use it back in the day for pages with enormously long tables filled with data from a database. You'd flush the buffer every x rows so the user knew the page was actually working. Then someone heard about usability and pages like that got paging and search.

Well, that is implicit output buffering, right? You can do just fine without ob_ family of functions, no matter the length of data, and regardless usability and paging.
– Armen Michaeli
Commented Jan 27, 2010 at 16:06

Langdon · Accepted Answer · 2010-01-27 15:36:47Z

0

It's useful if you're trying to display a progress bar during a page that takes some time to process. Since PHP code isn't multi-threaded, you can't do this if the processing is hung up in one function.

Ehm, and I have tested this, you can do exactly that without output buffering API?! Echo your progress message at the beginning of a lengthy operation, call 'flush', and start your lengthy processing. The server will switch to 'chunked' transfer encoding, your progress message will already be at the client end, while your script is still executing.
– Armen Michaeli
Commented Jan 27, 2010 at 16:28

CoCoMo · Accepted Answer · 2010-11-15 23:00:17Z

0

If you are doing a lot of database transactions and processing, use output buffering to cache the generated output in a file and serve it to other similar requests.
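
A minimal sketch of file caching with an output buffer. The function name `cached_page`, the cache path and the render callable are all assumptions supplied by the caller, and there is no expiry or invalidation here.

```php
<?php
// Serve a page from a cache file if present; otherwise render, store, send.
// $cacheFile and $render are supplied by the caller (illustrative names).
function cached_page(string $cacheFile, callable $render): void
{
    if (is_file($cacheFile)) {
        readfile($cacheFile);          // cache hit: skip the expensive work
        return;
    }
    ob_start();
    $render();                         // expensive queries and formatting
    file_put_contents($cacheFile, (string) ob_get_contents(), LOCK_EX);
    ob_end_flush();                    // still send this first response
}
```

A real deployment would also need to decide when a cached file is stale and how cache keys map to requests.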
